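We wrote a program for simple resumes that extracts the whole resume, line by line, into a string. Now I want to extract the GPA from that string. I have tried a lot but have no idea how to do it, so it would be very helpful if someone could work this out.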
import os
import fitz  # PyMuPDF

resume_dir = 'C:/Users/M. Abrar Hussain/Desktop/cv/resume'
search_keywords = ['Laravel', 'Java', 'Python']

for filename in os.listdir(resume_dir):
    if not filename.endswith('.pdf'):
        continue
    print(filename)
    # Open the PDF by path; the with-block closes it when done.
    with fitz.open(os.path.join(resume_dir, filename)) as doc:
        text = ""
        for page in doc:
            text += page.get_text()  # getText() in older PyMuPDF releases
        print(text)
        p = doc.load_page(0)  # loadPage() in older PyMuPDF releases
        p_lines = p.get_text().splitlines()
    print(p_lines)

    # Comparing data with keywords
    lst = []
    for sentence in p_lines:
        for word in search_keywords:
            if word in sentence:
                lst.append(word)
    print(lst)

    # Score the candidate from the matched keywords
    score = 0
    for w1 in lst:
        if w1 == 'Laravel':
            score = score + 2
        if w1 == 'Python':
            score = score + 2
        if w1 == 'Java':
            score = score + 1
    print("The candidate has Score = %i" % score)
    print('\n')
The output of this code is
cv1.pdf
Name: M. Abrar Hussain
GPA: 3.5
Skills: Python, Laravel
Experience: 3 yr
['Name: M. Abrar Hussain ', 'GPA: 3.5 ', 'Skills: Python, Laravel ', 'Experience: 3 yr ']
['Laravel', 'Python']
The candidate has Score = 4
cv2.pdf
Name: Danish Ejaz
GPA: 3.7
Skills: Python, Java
Experience: 2.5 yr
['Name: Danish Ejaz ', 'GPA: 3.7 ', 'Skills: Python, Java ', 'Experience: 2.5 yr ']
['Java', 'Python']
The candidate has Score = 3
cv3.pdf
Name: Abdullah
GPA: 3.2
Skills: Laravel, Java
Experience: 2 yr
['Name: Abdullah ', 'GPA: 3.2 ', 'Skills: Laravel, Java ', 'Experience: 2 yr ']
['Laravel', 'Java']
The candidate has Score = 3
Process finished with exit code 0
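In this output, we can compare the skills against the keywords and assign a score. Now our main focus is to extract the GPA value from the string and, after a comparison, assign a score for it, just as we did for the skills.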
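To make the goal concrete, here is a minimal sketch of what we are after, not a finished solution. It assumes the GPA always appears on a line shaped like `GPA: 3.5`, as in the output above. The names `GPA_PATTERN`, `extract_gpa` and `gpa_score` are hypothetical helpers, and the score thresholds are made-up values for illustration only.

```python
import re

# Assumption: the GPA is written as "GPA", an optional ":" or "=", then a number.
GPA_PATTERN = re.compile(r'\bGPA\b\s*[:=]?\s*(\d+(?:\.\d+)?)', re.IGNORECASE)


def extract_gpa(lines):
    """Return the first GPA found in the given lines as a float, or None."""
    for line in lines:
        match = GPA_PATTERN.search(line)
        if match:
            return float(match.group(1))
    return None


def gpa_score(gpa):
    """Map a GPA to a score. The thresholds here are hypothetical."""
    if gpa is None:
        return 0
    if gpa >= 3.5:
        return 3
    if gpa >= 3.0:
        return 2
    return 1


# Sample lines copied from the cv1.pdf output above
p_lines = ['Name: M. Abrar Hussain ', 'GPA: 3.5 ',
           'Skills: Python, Laravel ', 'Experience: 3 yr ']
gpa = extract_gpa(p_lines)
print("GPA = %s, GPA score = %i" % (gpa, gpa_score(gpa)))
```

In the main loop, the result of `gpa_score(extract_gpa(p_lines))` could simply be added to the existing `score`. A resume that writes the value differently (for example `CGPA` or `3.5/4.0`) would need a different pattern.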