ideas

Finding best fasttext hyperparameters

If you check the fasttext info page, you will see that fasttext has many input parameters, both for training and for the dictionary. If you have ever tried to tune your model's accuracy, you will have seen that changing these parameters can change the model's precision and recall dramatically. So I decided to make a

Trying fasttext classifier models with different corpus

After running some experiments with Stack Overflow data, I wondered how these models work with different corpora. Is it a good idea to predict tags from the body with a model that was trained on titles? I used the models from this post and a simple methodology for this experiment:

Fasttext classifier for stackoverflow data

I think I'm obsessed with open data and open data processing. I think every website that is driven by user content (like Wikipedia, Twitter, Tumblr, Facebook) needs to provide an API or a data dump so the community can use this information. It's a fair trade, right? People provide information and then