topmodel is a service for evaluating binary classifiers. It comes with built-in metrics and comparisons so that you don't have to build your own from scratch.
You can store your data either locally or in S3.
Here are the graphs topmodel will give you for any binary classifier:
We also use bootstrapping to show the uncertainty on ROC curves and precision/recall curves. Here's an example:
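For readers new to bootstrapping, here is a minimal sketch of the idea. It is not topmodel's own code: the function names are hypothetical, and for brevity it summarises each resampled ROC curve by its area (AUC) rather than drawing the whole curve. It assumes `actuals` holds 0/1 (or True/False) labels and `scores` holds the matching predicted scores.

```python
import random


def roc_auc(actuals, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs the scores rank correctly; ties count half."""
    pos = [s for a, s in zip(actuals, scores) if a]
    neg = [s for a, s in zip(actuals, scores) if not a]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(actuals, scores, n_resamples=200, seed=0):
    """Resample the rows with replacement and return an approximate
    95% interval (2.5th and 97.5th percentiles) for the AUC."""
    rng = random.Random(seed)
    n = len(actuals)
    aucs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        auc = roc_auc([actuals[i] for i in idx], [scores[i] for i in idx])
        if auc is not None:  # skip resamples that contain a single class
            aucs.append(auc)
    if not aucs:
        raise ValueError("no resample contained both classes")
    aucs.sort()
    low = aucs[int(0.025 * (len(aucs) - 1))]
    high = aucs[int(0.975 * (len(aucs) - 1))]
    return low, high
```

A wide interval means the curve would look quite different on another sample of the same size, so small differences between two models should not be over-interpreted.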
The idea here is that among all items with score 0.9, you expect 90% of them to be in the target group (marked 'True'). This graph compares the expected rate to the actual rate -- the closer it is to a straight line, the better.
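The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not topmodel's implementation: `calibration_table` is a hypothetical helper, scores are assumed to lie in [0, 1], and the choice of ten equal-width bins is arbitrary.

```python
def calibration_table(actuals, scores, n_bins=10):
    """Group instances into equal-width score bins and return, for each
    non-empty bin, (mean predicted score, actual positive rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for actual, score in zip(actuals, scores):
        # Clamp so that a score of exactly 1.0 lands in the last bin.
        i = max(0, min(int(score * n_bins), n_bins - 1))
        bins[i].append((score, 1 if actual else 0))
    rows = []
    for members in bins:
        if members:
            mean_score = sum(s for s, _ in members) / len(members)
            actual_rate = sum(a for _, a in members) / len(members)
            rows.append((mean_score, actual_rate, len(members)))
    return rows
```

For a well-calibrated model the first two values in each row are close to each other, which is what a straight diagonal line on the graph corresponds to.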
There are two graphs that show score distributions for instances labelled 'True' and instances labelled 'False'. The first graph shows the log distribution of the scores:
And the second shows the absolute frequencies:
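To make the two views concrete, here is a small sketch that counts scores per bin for each label and then log-scales the counts. It rests on assumptions: that the log view is the same histogram on a log-scaled frequency axis, that scores lie in [0, 1], and that labels are 0/1 or True/False. The helper names are hypothetical and this is not topmodel's own code.

```python
import math


def score_histograms(actuals, scores, n_bins=10):
    """Absolute frequency of scores per equal-width bin, split by label."""
    counts = {True: [0] * n_bins, False: [0] * n_bins}
    for actual, score in zip(actuals, scores):
        i = max(0, min(int(score * n_bins), n_bins - 1))
        counts[bool(actual)][i] += 1
    return counts


def log_counts(counts):
    """log10(1 + count) per bin, so empty bins map to 0 instead of -inf."""
    return [math.log10(1 + c) for c in counts]
```

The log-scaled view makes sparsely populated bins visible when one score range dominates, while the absolute view shows where most instances actually fall.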
topmodel comes with example data so you can try it out right away. Here's how:
- Create a virtualenv
- Install the requirements:
  pip install -r requirements.txt
- Start a topmodel server:
  ./topmodel_server.py
- topmodel should now be running at http://localhost:9191.
- See a page of metrics for some example data at http://localhost:9191/model/data/test/my_model_name/
You can now add new models for evaluation! See "How to add a model to topmodel" below for more.
It's better to store your model data in an S3 bucket, so that you don't lose it. To get this working:
Create a config.yaml file:
cp config_example.yaml config.yaml
and fill it in with the S3 bucket you want to use and your AWS secret key and access key. topmodel will automatically find models in the bucket as long as they're named correctly (see "How to add a model to topmodel").
Then start topmodel with
./topmodel_server.py --remote
- Create a TSV with columns 'pred_score' and 'actual'. Save it to your_model_name.tsv. The columns should be separated by tabs. In each row:
  - actual should be 0 or 1 (True/False also work)
  - pred_score should be the score the model determined.
  - weight is an optional third column if you want to weight different instances more or less (default is 1).
  - See the examples in example_data/
  - For example:
    actual	pred_score
    False	0.2
    True	0.8
    True	0.7
    False	0.3
- Copy the TSV to S3 at s3://your-s3-bucket/your_model_name/scores.tsv, or locally to data/your_model_name/scores.tsv
- You're done! Your model should appear at http://localhost:9191/ if you reload.
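If your scores already live in Python, the local variant of these steps can be sketched as below. `write_scores_tsv` is a hypothetical helper, not part of topmodel. It assumes the server is started from the repository root so that data/ is the local data directory, and it always writes the optional weight column.

```python
import csv
import os


def write_scores_tsv(rows, model_name, data_dir="data"):
    """Write (actual, pred_score, weight) tuples to
    <data_dir>/<model_name>/scores.tsv and return the path."""
    model_dir = os.path.join(data_dir, model_name)
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "scores.tsv")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["actual", "pred_score", "weight"])
        for actual, pred_score, weight in rows:
            # Store labels as 0/1; actual is expected to be a bool or 0/1.
            writer.writerow([int(bool(actual)), pred_score, weight])
    return path
```

For example, `write_scores_tsv([(False, 0.2, 1), (True, 0.8, 1)], "your_model_name")` creates data/your_model_name/scores.tsv with a header row followed by one row per instance.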
We'd love for you to contribute. If you run topmodel with
./topmodel_server.py --development
it will autoreload code.
There's example data to test on in data/test.
- Julia Evans http://twitter.com/b0rk
- Chris Wu http://github.com/cwu
- George Hotz http://geohot.com
Copyright 2014 Stripe, Inc
Licensed under the MIT license.