我正在寻找可以告诉我文本语言的算法(例如,你好 - 英语,Bonjour - 法语,Servicio - 西班牙语)并且还能纠正英语单词的拼写错误。我已经探索过 Google 的 TextBlob,它非常相关,但是一旦我的代码开始执行,它就会出现“请求过多”错误。我也开始探索 Polyglot,但在 Windows 上下载库时遇到了很多问题。
TextBlob 的代码
*import pandas as pd
from tkinter import filedialog
from textblob import TextBlob
import time
from time import sleep
colnames = ['Word']
x=filedialog.askopenfilename(title='Select the word list')
print("Data to be checked: " + x)
df = pd.read_excel(x,sheet_name='Sheet1',header=0,names=colnames,na_values='?',dtype=str)
words = df['Word']
i=0
Language_detector=pd.DataFrame(columns=['Word','Language','corrected_word','translated_word'])
for word in words:
b = TextBlob(word)
language_word=b.detect_language()
time.sleep(0.5)
if language_word in ['en','EN']:
corrected_word=b.correct()
time.sleep(0.5)
Language_detector.loc[i, ['corrected_word']]=corrected_word
else:
translated_word=b.translate(to='en')
time.sleep(0.5)
Language_detector.loc[i, ['Word']]=word
Language_detector.loc[i, ['Language']]=language_word
Language_detector.loc[i, ['translated_word']]=translated_word
i=i+1
filename="Language detector test v 1.xlsx"
Language_detector.to_excel(filename,sheet_name='Sheet1')
print("Languages identified for the word list")**