Scope And Content

Folder for the capstone-2023 repository. It contains SQL files for creating the database tables, a copy of the README, notebooks for data extraction, cleaning, exploration, and model creation, and dashboard files for building and hosting a local Flask server.

Technical Details

name: group6b
channels:
  - defaults
dependencies:
  - asttokens=2.0.5=pyhd3eb1b0_0
  - backcall=0.2.0=pyhd3eb1b0_0
  - ca-certificates=2022.07.19=haa95532_0
  - certifi=2022.9.14=py39haa95532_0
  - colorama=0.4.5=py39haa95532_0
  - debugpy=1.5.1=py39hd77b12b_0
  - decorator=5.1.1=pyhd3eb1b0_0
  - entrypoints=0.4=py39haa95532_0
  - executing=0.8.3=pyhd3eb1b0_0
  - ipykernel=6.15.2=py39haa95532_0
  - ipython=8.4.0=py39haa95532_0
  - jedi=0.18.1=py39haa95532_1
  - libsodium=1.0.18=h62dcd97_0
  - matplotlib-inline=0.1.6=py39haa95532_0
  - nest-asyncio=1.5.5=py39haa95532_0
  - openssl=1.1.1q=h2bbff1b_0
  - packaging=21.3=pyhd3eb1b0_0
  - parso=0.8.3=pyhd3eb1b0_0
  - pickleshare=0.7.5=pyhd3eb1b0_1003
  - pip=22.1.2=py39haa95532_0
  - psutil=5.9.0=py39h2bbff1b_0
  - pure_eval=0.2.2=pyhd3eb1b0_0
  - pygments=2.11.2=pyhd3eb1b0_0
  - pyparsing=3.0.9=py39haa95532_0
  - python=3.9.13=h6244533_1
  - python-dateutil=2.8.2=pyhd3eb1b0_0
  - pywin32=302=py39h2bbff1b_2
  - setuptools=63.4.1=py39haa95532_0
  - six=1.16.0=pyhd3eb1b0_1
  - sqlite=3.39.3=h2bbff1b_0
  - stack_data=0.2.0=pyhd3eb1b0_0
  - tornado=6.2=py39h2bbff1b_0
  - tzdata=2022c=h04d1e81_0
  - vc=14.2=h21ff451_1
  - vs2015_runtime=14.27.29016=h5e58377_2
  - wcwidth=0.2.5=pyhd3eb1b0_0
  - wheel=0.37.1=pyhd3eb1b0_0
  - wincertstore=0.2=py39haa95532_2
  - zeromq=4.3.4=hd77b12b_0
  - spacy-model-en_core_web_sm
  - pip:
    - affinegap==1.12
    - aiodns==3.0.0
    - aiohttp==3.8.3
    - aiohttp-retry==2.8.3
    - aiosignal==1.3.1
    - aniso8601==9.0.1
    - ansi2html==1.8.0
    - anyio==3.6.2
    - argon2-cffi==21.3.0
    - argon2-cffi-bindings==21.2.0
    - arrow==1.2.3
    - async-timeout==4.0.2
    - attrs==22.2.0
    - beautifulsoup4==4.11.1
    - bleach==6.0.0
    - blis==0.7.9
    - bpemb==0.3.4
    - brotli==1.0.9
    - btrees==4.10.1
    - cachetools==5.2.0
    - catalogue==2.0.8
    - categorical-distance==1.9
    - cchardet==2.1.7
    - cffi==1.15.1
    - charset-normalizer==2.1.1
    - click==8.1.3
    - click-plugins==1.1.1
    - cligj==0.7.2
    - cloudpickle==2.2.0
    - confection==0.0.3
    - conllu==4.5.2
    - contourpy==1.0.5
    - cycler==0.11.0
    - cymem==2.0.7
    - cython==0.29.28
    - dash==2.8.1
    - dash-core-components==2.0.0
    - dash-html-components==2.0.0
    - dash-table==5.0.0
    - datetime-distance==0.1.3
    - dedupe
    - dedupe-variable-datetime
    - defusedxml==0.7.1
    - deprecated==1.2.13
    - doublemetaphone==1.1
    - emoji==2.2.0
    - fastjsonschema==2.16.2
    - filelock==3.9.0
    - fiona==1.9.1
    - flair==0.11.3
    - flask
    - flask-cors
    - flask-restful
    - fonttools==4.37.4
    - fqdn==1.5.1
    - frozenlist==1.3.3
    - ftfy==6.1.1
    - funcy==1.17
    - future==0.18.2
    - gdown==4.4.0
    - gensim==4.2.0
    - geopandas==0.12.2
    - google-api-core==2.11.0
    - google-api-python-client==2.71.0
    - google-auth==2.15.0
    - google-auth-httplib2==0.1.0
    - google-auth-oauthlib==0.8.0
    - googleapis-common-protos==1.57.1
    - greenlet==1.1.3.post0
    - haversine==2.7.0
    - highered==0.2.1
    - httplib2==0.21.0
    - huggingface-hub==0.12.1
    - hyperopt==0.2.7
    - icecream==2.1.3
    - idna==3.4
    - igraph==0.10.4
    - importlib-metadata
    - ipython-genutils==0.2.0
    - ipywidgets==8.0.4
    - isodate==0.6.1
    - isoduration==20.11.0
    - itsdangerous==2.1.2
    - janome==0.4.2
    - jinja2==3.1.2
    - joblib==1.2.0
    - jsonpath-ng==1.5.3
    - jsonpointer==2.3
    - jsonschema==4.17.3
    - jupyter==1.0.0
    - jupyter-client==8.0.2
    - jupyter-console==6.5.0
    - jupyter-core==5.2.0
    - jupyter-dash==0.4.2
    - jupyter-events==0.6.3
    - jupyter-server==2.2.1
    - jupyter-server-terminals==0.4.4
    - jupyterlab-pygments==0.2.2
    - jupyterlab-widgets==3.0.5
    - kiwisolver==1.4.4
    - konoha
    - langcodes==3.3.0
    - langdetect==1.0.9
    - python-levenshtein
    - lxml==4.9.2
    - markupsafe==2.1.1
    - matplotlib==3.6.1
    - mistune==2.0.5
    - more-itertools==9.1.0
    - mpld3==0.3
    - multidict==6.0.4
    - munch==2.5.0
    - murmurhash==1.0.9
    - mwparserfromhell==0.6.4
    - nbclassic==0.5.1
    - nbclient==0.7.2
    - nbconvert==7.2.9
    - nbformat==5.7.3
    - neo4j==5.3.0
    - networkx==2.8.8
    - nltk==3.7
    - notebook==6.5.2
    - notebook-shim==0.2.2
    - numexpr==2.8.4
    - numpy==1.23.3
    - oauthlib==3.2.2
    - overrides==3.1.0
    - pandas==1.5.2
    - pandocfilters==1.5.0
    - pathy==0.6.2
    - patsy==0.5.3
    - persistent==4.9.1
    - pillow==9.2.0
    - platformdirs==3.0.0
    - plotly==5.13.0
    - plotly-geo==1.0.0
    - ply==3.11
    - pptree==3.1
    - preshed==3.0.8
    - prometheus-client==0.16.0
    - prompt-toolkit==3.0.36
    - protobuf==4.21.9
    - psycopg2==2.9.5
    - py-entitymatching==0.4.0
    - py-stringmatching==0.4.2
    - py-stringsimjoin==0.3.2
    - py4j==
    - pyasn1==0.4.8
    - pyasn1-modules==0.2.8
    - pycares==4.3.0
    - pycparser==2.21
    - pydantic==1.10.2
    - pyhacrf-datamade==0.2.6
    - pylbfgs==
    - pyldavis==3.3.1
    - pyprind==2.11.3
    - pyproj==3.4.1
    - pyqt5==5.15.7
    - pyqt5-qt5==5.15.2
    - pyqt5-sip==12.11.0
    - pyrsistent==0.19.3
    - pysocks==1.7.1
    - pytextrank==3.2.4
    - graphviz
    - python-igraph==0.10.4
    - python-json-logger==2.0.4
    - python-slugify==7.0.0
    - pytz==2022.2.1
    - pywikibot==8.0.0
    - pywinpty==2.0.10
    - pyyaml==6.0
    - pyzmq==25.0.0
    - qtconsole==5.4.0
    - qtpy==2.3.0
    - regex==2022.10.31
    - requests==2.28.1
    - requests-oauthlib==1.3.1
    - retrying==1.3.4
    - rfc3339-validator==0.1.4
    - rfc3986-validator==0.1.1
    - rsa==4.9
    - scikit-learn==1.1.2
    - scipy==1.9.1
    - segtok==1.5.11
    - send2trash==1.8.0
    - sentencepiece==0.1.95
    - shapely==2.0.1
    - simplecosine==1.2
    - sklearn==0.0
    - skops==0.6.0
    - smart-open==5.2.1
    - sniffio==1.3.0
    - soupsieve==2.3.2.post1
    - spacy==3.4.2
    - spacy-legacy==3.0.10
    - spacy-loggers==1.0.3
    - sqlalchemy==1.4.42
    - sqlitedict==2.1.0
    - srsly==2.4.5
    - stanza==1.4.2
    - statsmodels==0.13.5
    - tabulate==0.9.0
    - tenacity==8.2.1
    - terminado==0.17.1
    - text-unidecode==1.3
    - textblob==0.17.1
    - texttable==1.6.7
    - thinc==8.1.5
    - threadpoolctl==3.1.0
    - tinycss2==1.2.1
    - tokenizers==0.13.2
    - torch==1.13.0
    - tqdm==4.64.1
    - traitlets==5.9.0
    - transformers==4.26.1
    - typer==0.4.2
    - typing-extensions==4.4.0
    - unidecode==0.4.16
    - uri-template==1.2.0
    - uritemplate==4.1.1
    - urllib3==1.26.12
    - wasabi==0.10.1
    - webcolors==1.12
    - webencodings==0.5.1
    - websocket-client==1.5.1
    - werkzeug==2.2.2
    - wget==3.2
    - widgetsnbextension==4.0.5
    - wikidata==0.7.0
    - wikipedia==1.4.0
    - wikipedia-api==0.5.8
    - wrapt==1.15.0
    - yarl==1.8.2
    - zipp==3.13.0
    - zope-index==5.2.1
    - zope-interface==5.5.0


OpenCritic API extraction requires the Mega Subscription ($50/mo) to function properly. A PSQL (PostgreSQL) database must be set up and its access credentials entered into repo/files/client_id.json alongside the API key. The model learner used in the dashboard must also be selected in client_id.json. OpenCritic API T&A Article 3 may subject the project to teardown. The author currently has written email approval to use the data to "predict sales for individual games."
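As an illustration of the configuration step above, the following is a minimal sketch of reading client_id.json and checking that it is complete before running anything. The key names in REQUIRED_KEYS and the function name load_client_config are hypothetical; the actual schema of client_id.json is defined by the repository and is not documented in this record.

```python
import json
from pathlib import Path

# Hypothetical key names: the real schema of client_id.json is not given here.
REQUIRED_KEYS = (
    "api_key",        # OpenCritic API key
    "db_host",        # database access credentials
    "db_name",
    "db_user",
    "db_password",
    "model_learner",  # learner selected for the dashboard
)


def load_client_config(path):
    """Read the JSON config and fail early if an expected key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("client_id.json is missing keys: " + ", ".join(missing))
    return config
```

A call such as load_client_config("repo/files/client_id.json") would then return a plain dictionary, or raise a KeyError naming whatever has not been filled in yet.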

Scope And Content

Data extracted from HTTP queries to the data sources, with some cleaning applied. The archive should contain the raw folder, with subfolders for OpenCritic, Steam, and SteamSpy. The unzipped folder must replace the folder of the same name in the repository under repo/files/raw/. This can be done instead of extracting the data from the APIs.
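The folder replacement described above could be scripted roughly as follows. This is a sketch under the assumption that the archive's top-level entry is a folder named raw; the function name replace_raw_folder and the archive file name in the usage note are hypothetical.

```python
import shutil
import zipfile
from pathlib import Path


def replace_raw_folder(archive_path, files_dir):
    """Replace <files_dir>/raw with the 'raw' folder stored in the archive."""
    files_dir = Path(files_dir)
    target = files_dir / "raw"
    with zipfile.ZipFile(archive_path) as archive:
        # Assumption: the archive holds a top-level folder named "raw".
        members = [name for name in archive.namelist() if name.startswith("raw/")]
        if not members:
            raise ValueError("archive does not contain a top-level 'raw' folder")
        # Remove the existing folder only after the archive has been validated.
        if target.exists():
            shutil.rmtree(target)
        archive.extractall(files_dir, members=members)
    return target
```

For example, replace_raw_folder("raw.zip", "repo/files") would leave the extracted data at repo/files/raw/, after which the API extraction notebooks can be skipped.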

Scope And Content

Contains the models folder, which holds skops dumps of the hyperparameter-tuned models. The dumps serve as a cache so that re-runs of the tuning script can retrieve scores and explore the models; the folder replaces repo/files/models. It also contains JSON files of each model's best hyperparameters, which are used to train a new model for the dashboard. The build folder is the built webpage and is placed in repo/dashboard/frontend. The static folder goes in repo/dashboard/backend/static and contains a fitted random forest classifier model and its test scores.
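To illustrate how the best-hyperparameter JSON files might be consumed, here is a minimal sketch that collects them into one dictionary. It assumes one JSON file per model, named after the model, sitting directly in the models folder; that layout and the function name load_best_params are assumptions, not something this record specifies.

```python
import json
from pathlib import Path


def load_best_params(models_dir):
    """Map each model name (the JSON file stem) to its saved hyperparameters."""
    best = {}
    # Assumption: one "<model name>.json" file per tuned model.
    for json_path in sorted(Path(models_dir).glob("*.json")):
        with json_path.open(encoding="utf-8") as handle:
            best[json_path.stem] = json.load(handle)
    return best
```

The resulting dictionary could then be unpacked into an estimator constructor, for example scikit-learn's RandomForestClassifier(**params), provided the JSON keys match the constructor's parameter names.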

  • Educational Dataset Service Collection
Cite This Work

Coughlin, Robert K. (2023). Video Game Reviews Sentiment to Popularity. In Data Science & Engineering Master of Advanced Study (DSE MAS) Capstone Projects. UC San Diego Library Digital Collections.


Video Game Reviews Sentiment to Popularity is a project that investigated whether video game reviews from aggregator websites can be used to build a classification-based prediction of whether a game reaches a fixed threshold for the number of users on Steam. By converting review documents to vectors and then fitting them to a classification learner, the project achieved a medium level of accuracy.

Date Collected
  • 2022-12-27 to 2023-06-09
Date Issued
  • 2023
  • Gupta, Amarnath
  • Coughlin, Robert K.
  • DSE MAS - 2023 Cohort
  • Capstone projects
  • Classification
  • Hyperparameter tuning
  • Natural Language Processing (NLP)
  • Supervised learning
  • Video games


  • data
  • text
  • English
Related Resource

Creative Commons Attribution 4.0 International Public License

Rights Holder
  • Coughlin, Robert K.

Under copyright (US)

Use: This work is available from the UC San Diego Library. This digital copy of the work is intended to support research, teaching, and private study.

Constraint(s) on Use: This work is protected by the U.S. Copyright Law (Title 17, U.S.C.). Use of this work beyond that allowed by "fair use" or any license applied to this work requires written permission of the copyright holder(s). Responsibility for obtaining permissions and any use and distribution of this work rests exclusively with the user and not the UC San Diego Library. Inquiries can be made to the UC San Diego Library program having custody of the work.

Digital Object Made Available By

Research Data Curation Program, UC San Diego, La Jolla, 92093-0175
