The process is a bit involved. What you need to do is:
- Extract text from the image with lang='eng'.
- Pass that text to langdetect, a Python port of Google's language-detection library.
- Run Tesseract again with the detected language to extract the text accurately.

You can handle each language in a switch-case (or a dictionary lookup in Python) and pass the sample text to langdetect to get the probability of each candidate language.
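    import pytesseract
    from PIL import Image
    from langdetect import detect_langs
    # pytesseract.pytesseract.tesseract_cmd = r'<full_path_to_your_tesseract_executable>'
    # Uncomment the line above if the tesseract executable is not in your PATH
    # Example tesseract_cmd: 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
    sample_text = pytesseract.image_to_string(Image.open('image.jpg'), lang='eng')
    print(detect_langs(sample_text))  # candidates with probabilities, most likely first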
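
The three steps can be tied together roughly as follows. This is only a sketch under a few assumptions: langdetect reports short codes such as `en` or `de`, while Tesseract expects names such as `eng` or `deu`, so a lookup table is needed in between. The table `LANGDETECT_TO_TESSERACT`, the helper `to_tesseract_lang` and the function `ocr_with_detected_language` are hypothetical names made up for this example, the table is deliberately incomplete, and the matching Tesseract language data has to be installed for the second pass to work.

```python
# Hypothetical, incomplete mapping from langdetect codes to Tesseract
# language names; extend it with the languages you expect to see.
LANGDETECT_TO_TESSERACT = {
    "en": "eng",
    "de": "deu",
    "fr": "fra",
    "es": "spa",
    "it": "ita",
    "ru": "rus",
}


def to_tesseract_lang(code, default="eng"):
    """Translate a langdetect code into a Tesseract language name."""
    return LANGDETECT_TO_TESSERACT.get(code, default)


def _default_ocr(image_path, lang):
    # Imported lazily so the mapping helper works without Tesseract installed.
    import pytesseract
    from PIL import Image
    return pytesseract.image_to_string(Image.open(image_path), lang=lang)


def _default_detect(text):
    # detect_langs returns candidates sorted by probability, highest first.
    from langdetect import detect_langs
    return detect_langs(text)


def ocr_with_detected_language(image_path, ocr=_default_ocr, detect=_default_detect):
    """OCR in English, detect the language, then OCR again in that language."""
    sample_text = ocr(image_path, "eng")       # step 1: rough first pass
    best = detect(sample_text)[0]              # step 2: most probable language
    lang = to_tesseract_lang(best.lang)
    if lang == "eng":
        return lang, sample_text               # first pass was already English
    return lang, ocr(image_path, lang)         # step 3: second pass
```

Calling `ocr_with_detected_language('image.jpg')` should return the Tesseract language name that was used together with the extracted text. The `ocr` and `detect` parameters are only there so the logic can be tried with stand-in functions. Note that langdetect may raise an error when the first pass yields no usable text, so you may want to wrap the call in a try/except and fall back to `eng`.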