Loading the Model (and Complaining about Memory Usage)

How I loaded the files in Python

I used joblib to save the vocabulary, the tf-idf transformer, and the trained classifier:

from sklearn.externals import joblib  # in newer scikit-learn, use: import joblib

# save the vocabulary, the tf-idf weights, and the trained classifier
feature_list = count_vect.get_feature_names()
model = "model105"
joblib.dump(feature_list, model + '_vocabulary.pkl')
joblib.dump(tfidf_transformer, model + '_transform.pkl')
joblib.dump(clf, model + '.pkl', compress=9)  # compress=9: smallest file on disk, slowest to load

and joblib to load everything back into memory:

from sklearn.externals import joblib  # in newer scikit-learn, use: import joblib
from nltk import word_tokenize
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import CountVectorizer

class LemmaTokenizer(object):
    def __init__(self):
        self.wnl = WordNetLemmatizer()

    def __call__(self, doc):
        return [self.wnl.lemmatize(t) for t in word_tokenize(doc)]

# rebuild the vectorizer from the saved vocabulary instead of refitting it
count_vect = CountVectorizer(tokenizer=LemmaTokenizer(),
                             vocabulary=joblib.load('model105_vocabulary.pkl'))
tfidf_transformer = joblib.load('model105_transform.pkl')
clf = joblib.load('model105.pkl')
clf.densify()  # convert sparse coefficients to dense for faster prediction
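
With everything loaded, classifying a new document is the usual scikit-learn transform chain. A quick sketch (the actual labels depend on how the classifier was trained):

docs = ["An example document to classify."]
counts = count_vect.transform(docs)          # map tokens to the saved vocabulary
tfidf = tfidf_transformer.transform(counts)  # apply the saved tf-idf weights
print(clf.predict(tfidf))                    # predicted label(s)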

I used LemmaTokenizer to tokenize the text. (It's actually a lemmatizer rather than a stemmer: it maps each word to its dictionary form instead of just chopping off suffixes.)
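
The difference is easy to see side by side; a small comparison (needs NLTK's wordnet corpus downloaded):

from nltk.stem import WordNetLemmatizer, PorterStemmer

wnl = WordNetLemmatizer()
stemmer = PorterStemmer()
print(wnl.lemmatize("studies"))   # 'study'  - a real dictionary form
print(stemmer.stem("studies"))    # 'studi'  - a truncated stem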

Memory Usage

It turns out that the model, 300MB on disk, took around 800MB of RAM once loaded. That ruled out Heroku: the slug size limit is 500MB, and a free dyno only gets 512MB of RAM. Very frustrating. The only way to host this was a real server on AWS, Google Cloud or DigitalOcean.
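
If you want to measure this yourself, here's a rough sketch using psutil (the RSS delta is approximate, since the interpreter allocates memory for other reasons too):

import os
import joblib
import psutil

process = psutil.Process(os.getpid())
before = process.memory_info().rss          # resident memory before loading
clf = joblib.load('model105.pkl')
after = process.memory_info().rss           # resident memory after loading
print("Model RAM usage: ~%d MB" % ((after - before) / 1024 / 1024))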

I decided to wrap the model in a Django website and voilà, I'm done!
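
For reference, the core of that wrapper is small. A minimal sketch of a views.py, assuming the pickle files above sit next to it and that LemmaTokenizer lives in a local tokenizers module (that module name, the predict view, and the text query parameter are all illustrative; the URL wiring is omitted):

# views.py -- minimal sketch, not the full project
import joblib
from django.http import JsonResponse
from sklearn.feature_extraction.text import CountVectorizer

from .tokenizers import LemmaTokenizer  # hypothetical module holding the class above

# load once at import time, so every request reuses the same objects
count_vect = CountVectorizer(tokenizer=LemmaTokenizer(),
                             vocabulary=joblib.load('model105_vocabulary.pkl'))
tfidf_transformer = joblib.load('model105_transform.pkl')
clf = joblib.load('model105.pkl')

def predict(request):
    text = request.GET.get('text', '')
    tfidf = tfidf_transformer.transform(count_vect.transform([text]))
    return JsonResponse({'label': str(clf.predict(tfidf)[0])})

Loading at module level means each worker process pays the 800MB once, rather than reloading the model on every request.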