This repo contains an AllenNLP model that predicts Named Entity categories from their mentions.
You can generate some fake data using this Notebook
Filtered OneShotWikilinks dataset with manually selected categories.
- Create category graph build_category_graph.ipynb
- Produces:
category_graph.pkl
- Obtain the list of Person articles from Ontology obtain_people_articles.ipynb:
- Requires:
dbpedia_2016-10.owl
- Produces:
people_categories.json
- Build mapping from article to people categories generate_full_people_categories.ipynb:
- Requires:
people_categories.json
category_graph.pkl
projects/categories_prediction/manual_categories.gsheet
- Filter mentions for people filter_mentions.ipynb.
- Requires:
people_all_categories.json
- Produces:
people_mentions.tsv
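The notebook steps above form a small dependency chain, and it can help to check which inputs are still missing before running them. Below is a minimal sketch, not part of the repo. It assumes all files sit in one directory (the `data_dir` argument is hypothetical), uses only the base name of `manual_categories.gsheet`, and assumes that `generate_full_people_categories.ipynb` produces `people_all_categories.json`, which is inferred from the next step's requirements rather than stated.

```python
from pathlib import Path

# Each step: (notebook, required inputs, produced outputs), as listed above.
PIPELINE = [
    ("build_category_graph.ipynb", [], ["category_graph.pkl"]),
    ("obtain_people_articles.ipynb",
     ["dbpedia_2016-10.owl"],
     ["people_categories.json"]),
    ("generate_full_people_categories.ipynb",
     # Base name only; the README lists this file under projects/categories_prediction/.
     ["people_categories.json", "category_graph.pkl", "manual_categories.gsheet"],
     # Assumption: output inferred from what filter_mentions.ipynb requires.
     ["people_all_categories.json"]),
    ("filter_mentions.ipynb",
     ["people_all_categories.json"],
     ["people_mentions.tsv"]),
]


def missing_inputs(data_dir, pipeline=PIPELINE):
    """Return {notebook: [missing files]} for steps whose inputs are absent."""
    data_dir = Path(data_dir)
    report = {}
    for notebook, requires, _produces in pipeline:
        missing = [name for name in requires if not (data_dir / name).exists()]
        if missing:
            report[notebook] = missing
    return report


def external_inputs(pipeline=PIPELINE):
    """Files that no step produces, so they must be supplied up front."""
    produced = {name for _, _, outs in pipeline for name in outs}
    needed = {name for _, ins, _ in pipeline for name in ins}
    return sorted(needed - produced)
```

`external_inputs()` lists the files you have to obtain yourself, and `missing_inputs("some/dir")` shows which notebooks cannot run yet.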
Prepare split data with:
!split -n l/10 --verbose ../data/fake_data_train.tsv ../data/fake_data_train.tsv_
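If GNU `split` is not available (for example on macOS or Windows), a rough Python stand-in could look like the sketch below. It is an approximation, not the same algorithm: `split -n l/10` balances chunks by byte size without breaking lines, while this version balances by line count. The function name `split_lines` is hypothetical; the `_aa`, `_ab`, ... suffixes mimic the default naming of `split`.

```python
from pathlib import Path
from string import ascii_lowercase


def split_lines(path, n_chunks=10):
    """Split a text file into n_chunks contiguous parts without breaking lines."""
    path = Path(path)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    # Chunk sizes differ by at most one line.
    base, extra = divmod(len(lines), n_chunks)
    outputs, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        # Two-letter suffixes aa, ab, ... (enough for up to 676 chunks).
        suffix = ascii_lowercase[i // 26] + ascii_lowercase[i % 26]
        out = path.with_name(f"{path.name}_{suffix}")
        out.write_text("".join(lines[start:start + size]), encoding="utf-8")
        outputs.append(out)
        start += size
    return outputs
```

Calling `split_lines("../data/fake_data_train.tsv")` would write `fake_data_train.tsv_aa` through `fake_data_train.tsv_aj` next to the input file.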
pip install -r requirements.txt
rm -rf ./data/vocabulary ; allennlp make-vocab -s ./data/ allen_conf_vocab.json --include-package category_prediction
allennlp train -f -s data/stats allen_conf.json --include-package category_prediction
allennlp train -f -s data/stats allen_conf.json --include-package category_prediction -o '{"trainer": {"cuda_device": 0}}'
rm -rf data/stats2/ # Clear new serialization dir
allennlp fine-tune -s data/stats2/ -c allen_conf.json -m ./data/stats/model.tar.gz --include-package category_prediction -o '{"trainer": {"cuda_device": 0}, "iterator": {"base_iterator": {"batch_size": 64}}}'
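The `-o` overrides are plain JSON, so hand-quoting them in the shell gets error-prone as they grow. A minimal sketch of building the same command line from a Python dict follows; the helper `allennlp_command` is hypothetical and only assembles arguments that already appear in the commands above.

```python
import json
import shlex


def allennlp_command(subcommand, args, overrides=None, package="category_prediction"):
    """Assemble an allennlp command line as a list of arguments."""
    cmd = ["allennlp", subcommand, *args, "--include-package", package]
    if overrides:
        # json.dumps guarantees valid JSON for the -o option.
        cmd += ["-o", json.dumps(overrides)]
    return cmd


fine_tune = allennlp_command(
    "fine-tune",
    ["-s", "data/stats2/", "-c", "allen_conf.json", "-m", "./data/stats/model.tar.gz"],
    overrides={
        "trainer": {"cuda_device": 0},
        "iterator": {"base_iterator": {"batch_size": 64}},
    },
)

# shlex.join quotes the JSON so the line can be pasted into a shell.
print(shlex.join(fine_tune))
```

The list form can also be passed directly to `subprocess.run(fine_tune, check=True)`, which avoids shell quoting altogether.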
allennlp evaluate ./data/stats/model.tar.gz ./data/fake_data_test.tsv --include-package category_prediction
MODEL=./data/trained_models/6th_augmented/model.tar.gz python run_server.py
gunicorn -c gunicorn_config.py wsgi:application
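For orientation, a gunicorn config is an ordinary Python file whose module-level names are read as settings. The repo's actual `gunicorn_config.py` is not reproduced here; the sketch below is hypothetical. Port 8000 is assumed from the Docker port mapping, and the environment variable names `GUNICORN_WORKERS` and `GUNICORN_TIMEOUT` are made up for illustration.

```python
import os

# Assumption: listen on port 8000, matching the -p 8000:8000 Docker mapping.
bind = "0.0.0.0:8000"

# Each worker process loads its own copy of the model, so default to one.
# GUNICORN_WORKERS is a hypothetical variable name.
workers = int(os.environ.get("GUNICORN_WORKERS", "1"))

# Give workers extra time to boot, since loading a model archive can be slow.
# GUNICORN_TIMEOUT is a hypothetical variable name.
timeout = int(os.environ.get("GUNICORN_TIMEOUT", "120"))
```

Keeping `workers` low is a deliberate trade-off here: more workers add request parallelism but multiply memory use, because the model is loaded once per process.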
Build
cd docker
docker build --tag mention .
Run, passing pyenv into the container
docker run --restart unless-stopped -v $HOME:$HOME -p 8000:8000 \
-v $HOME/.pyenv:/root/.pyenv \
-e ENV_PATH=$HOME/virtualenv/path \
-e APP_PATH=$HOME/project/root/path mention
Fix 100% GPU utilization by enabling persistence mode
sudo nvidia-smi -pm 1