# vsv_proj

Machine vision project

### Place images to OCR in a folder called `input`.
Dependencies are listed in `pyproject.toml`.
To run and work on the project:
* Install python >=3.9
* Install `tesseract tesseract-data-eng` using your favorite package manager
* Install poetry: `python -m pip install poetry`
    * Create poetry config location: `mkdir -p ~/.config/pypoetry`
    * Tell poetry to keep each venv inside its project: `printf '[virtualenvs]\nin-project = true\n' > ~/.config/pypoetry/config.toml`
* Clone this repo
* cd into repo: `cd vsv_proj`
* Create folder for input images: `mkdir input`
* Place extracted images into the `input` folder
* Create virtual env: `python -m venv .venv`
* Install dependencies: `python -m poetry install`
* Activate venv: `. .venv/bin/activate`
* Run script: `python ocr.py`
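
Before running `ocr.py`, the prerequisites from the steps above can be checked with a few lines of standard-library Python. This is only a sketch: the `preflight` function and the idea of saving it as a separate file (for example a hypothetical `preflight.py`) are assumptions for illustration and not part of this repo. It checks the Python version, whether a `tesseract` executable is on `PATH`, and whether the `input` folder contains at least one file.

```python
import shutil
import sys
from pathlib import Path


def preflight(project_dir="."):
    """Return a dict mapping each setup check to True (passed) or False."""
    input_dir = Path(project_dir) / "input"
    return {
        # The README asks for Python 3.9 or newer.
        "python>=3.9": sys.version_info >= (3, 9),
        # shutil.which returns None when the executable is not on PATH.
        "tesseract on PATH": shutil.which("tesseract") is not None,
        # The input folder must exist and hold at least one file.
        "input folder has files": input_dir.is_dir()
        and any(p.is_file() for p in input_dir.iterdir()),
    }


if __name__ == "__main__":
    # Print one line per check for the current working directory.
    for name, ok in preflight().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Run it from the repo root; any line marked `MISSING` points at the setup step that still needs attention.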

To run and work on the project in Windows (cmd):
* Install [python](https://www.python.org/downloads/) >=3.9
* Install [tesseract](https://github.com/UB-Mannheim/tesseract/wiki), **Use default installation path!**
* Add tesseract to PATH (using admin cmd): `setx Path "C:\Program Files\Tesseract-OCR;%PATH%"`
* Close and reopen cmd window(s)
* Install poetry: `python.exe -m pip install poetry`
    * Create poetry config location: `mkdir %APPDATA%\pypoetry`
    * Tell poetry to separate venvs: `echo [virtualenvs] > %APPDATA%\pypoetry\config.toml && echo.in-project=true >> %APPDATA%\pypoetry\config.toml`
* Clone or download and extract the repo
* Navigate to the cloned/extracted repo: `cd PATH-TO-FOLDER`
* Create folder for input images: `mkdir input`
* Place extracted images into the `input` folder
* Create virtual env: `python.exe -m venv .venv`
* Install dependencies: `python.exe -m poetry install`
* Activate venv: `".venv\Scripts\activate.bat"`
* Run script: `python.exe ocr.py`

Note: To deactivate the venv, run `".venv\Scripts\deactivate.bat"`
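
The poetry config step can also be done from Python, which avoids the differences between shell `echo` implementations. The sketch below is an illustration only: `poetry_config_dir` and `write_poetry_config` are hypothetical helper names, and the two config locations are just the ones named in this README (`~/.config/pypoetry` and `%APPDATA%\pypoetry`); other platforms may use a different folder.

```python
import os
from pathlib import Path


def poetry_config_dir():
    """Return the poetry config folder this README uses on the current OS."""
    if os.name == "nt":
        # Windows: %APPDATA%\pypoetry
        return Path(os.environ["APPDATA"]) / "pypoetry"
    # Otherwise: ~/.config/pypoetry
    return Path.home() / ".config" / "pypoetry"


def write_poetry_config(config_dir):
    """Write config.toml with in-project venvs enabled and return its path.

    Like the shell commands in this README, this overwrites an existing file.
    """
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    config_file = config_dir / "config.toml"
    config_file.write_text("[virtualenvs]\nin-project = true\n", encoding="utf-8")
    return config_file
```

Calling `write_poetry_config(poetry_config_dir())` should produce the same file as the `mkdir` and `echo` steps above.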