programmer.land: August 2022

This project is really huge.
For now I'm going to analyze its first release, which has very limited functionality.

Corpus of short texts indexed by:
- Words
- Morphemes (when possible)
- Idioms (when possible)
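
As a rough illustration of the word index only, here is a minimal sketch in Python. The tokenizer, the function names and the two sample texts are my own assumptions, not part of the design; morpheme and idiom indexes would need something smarter than a regular expression.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Hypothetical tokenizer: runs of letters, optionally joined by apostrophes.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def tokenize(text: str) -> list[str]:
    """Split a text into lowercase words."""
    return [m.group(0).lower() for m in WORD_RE.finditer(text)]


def build_word_index(corpus: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the ids of the texts that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in corpus.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return dict(index)


# Made-up sample corpus, keyed by a hypothetical text id.
corpus = {
    "t1": "The cat sat on the mat.",
    "t2": "A cat is not a dog.",
}
index = build_word_index(corpus)
```

Looking up index["cat"] then gives the ids of every text where the word appears, which is what a lesson builder would need to pick texts for a given word.
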
Word explanations collected on demand
Data about the user's performance:
- Frequency, time spent and payload of lessons
- User's dictionary: words that were never marked as unknown or that stopped being marked as such (with a count of both states or a list of docs)
- Attention: how many words are marked as unknown for the first time only after a number of appearances
- Learning effectiveness: how many times is a word marked as unknown?
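
The per-word measures above could be derived from a simple lesson log. The sketch below is only one possible reading: it assumes each lesson is recorded as a pair (words shown, words marked as unknown), and it treats a word as "no longer unknown" if it was not marked at its most recent appearance. All names here (WordStats, collect_stats and so on) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class WordStats:
    appearances: int = 0        # lessons in which the word was shown
    times_unknown: int = 0      # lessons in which it was marked as unknown
    seen_before_first_mark: Optional[int] = None  # appearances before the first mark
    unknown_last_time: bool = False  # marked at its most recent appearance?


def collect_stats(lessons: list[tuple[set[str], set[str]]]) -> dict[str, WordStats]:
    """Accumulate per-word counters from (shown, marked_unknown) lesson pairs."""
    stats: dict[str, WordStats] = {}
    for shown, unknown in lessons:
        for word in shown:
            s = stats.setdefault(word, WordStats())
            marked = word in unknown
            if marked:
                if s.times_unknown == 0:
                    s.seen_before_first_mark = s.appearances
                s.times_unknown += 1
            s.unknown_last_time = marked
            s.appearances += 1
    return stats


def user_dictionary(stats: dict[str, WordStats]) -> set[str]:
    """Words never marked as unknown, or not marked at their last appearance."""
    return {w for w, s in stats.items() if not s.unknown_last_time}


def late_first_marks(stats: dict[str, WordStats], min_appearances: int = 1) -> set[str]:
    """Words first marked as unknown only after min_appearances earlier showings."""
    return {
        w for w, s in stats.items()
        if s.seen_before_first_mark is not None
        and s.seen_before_first_mark >= min_appearances
    }


# Made-up log of three lessons.
lessons = [
    ({"cat", "dog", "mat"}, {"mat"}),
    ({"cat", "mat"}, {"cat"}),
    ({"cat", "mat", "dog"}, set()),
]
stats = collect_stats(lessons)
```

Here times_unknown corresponds to the learning effectiveness measure and late_first_marks to the attention measure; since everything is computed from the raw log, the log is the only thing that has to be stored on the user's machine.
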

A progressive web app (PWA) storing the user's data on the user's machine.
- same code for desktop/tablet/mobile
- no installation or upgrades
- 100% private for the user
- user == user agent (browser or mobile device)
Store corpus and word explanations in hosted MongoDB
- MongoDB supports regular expressions as a data type.
- I know how to do it in SQL. Now it's time to learn NoSQL 😉.
- Hosted (not self-hosted) is a necessity. I'm not ready to run my own web server yet.
Use clustering for morpheme and idiom extraction
- It should be able to accept hints from users (with a grain of salt, of course)
Use GraphQL for client-server interaction
- Easier to grow with the project than REST
- Steeper learning curve
Use the Text-to-Speech (TTS) functionality available in the browser
- No need to store audio
- Easy synchronization with text highlighting
To begin with, implement the backend in Python
- Python is the best-equipped language for ML
- It supports everything I need in the backend now

To be continued ...

Tuesday, August 9, 2022