
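We wrote a program for simple resumes that extracts the entire resume text from a PDF into a string, line by line. Now I want to extract the GPA from that string. I have tried a lot but have no idea how to do it, so it would be very helpful if someone could show me how.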

import os

import fitz  # PyMuPDF

resume_dir = 'C:/Users/M. Abrar Hussain/Desktop/cv/resume'
search_keywords = ['Laravel', 'Java', 'Python']
# Points awarded for each keyword found in the resume
keyword_points = {'Laravel': 2, 'Python': 2, 'Java': 1}

for filename in os.listdir(resume_dir):
    if filename.endswith('.pdf'):
        print(filename)
        # Open by full path so the current working directory does not matter
        with fitz.open(os.path.join(resume_dir, filename)) as doc:
            # Full text of the resume (get_text replaces the deprecated getText)
            text = ""
            for page in doc:
                text += page.get_text()
            print(text)
            # Split the first page into lines
            p = doc.load_page(0)
            p_text = p.get_text()
            p_lines = p_text.splitlines()
            print(p_lines)
            # Comparing data with keywords
            lst = []
            for sentence in p_lines:
                for word in search_keywords:
                    if word in sentence:
                        lst.append(word)
            print(lst)

            # Add up the points of every keyword that was found
            score = 0
            for w1 in lst:
                score = score + keyword_points[w1]

            print("The candidate has Score = %i" % score)
            print('\n')

The output of this code is:

cv1.pdf
Name: M. Abrar Hussain 
GPA: 3.5 
Skills: Python, Laravel 
Experience: 3 yr 

['Name: M. Abrar Hussain ', 'GPA: 3.5 ', 'Skills: Python, Laravel ', 'Experience: 3 yr ']
['Laravel', 'Python']
The candidate has Score = 4


cv2.pdf
Name: Danish Ejaz 
GPA: 3.7 
Skills: Python, Java 
Experience: 2.5 yr 

['Name: Danish Ejaz ', 'GPA: 3.7 ', 'Skills: Python, Java ', 'Experience: 2.5 yr ']
['Java', 'Python']
The candidate has Score = 3


cv3.pdf
Name: Abdullah 
GPA: 3.2 
Skills: Laravel, Java 
Experience: 2 yr 

['Name: Abdullah ', 'GPA: 3.2 ', 'Skills: Laravel, Java ', 'Experience: 2 yr ']
['Laravel', 'Java']
The candidate has Score = 3



Process finished with exit code 0

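In this output, we compare the skills against the keywords and assign a score. Our main goal now is to extract the GPA value from the string and assign a score after comparing it, just as we did for the skills.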
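
For illustration, here is a minimal sketch of what that GPA step might look like, using a regular expression on the same list of lines. The names `GPA_PATTERN`, `extract_gpa`, and `gpa_score` are hypothetical, the pattern assumes the GPA appears as "GPA: 3.5" as in the sample output, and the score thresholds are made up for the example.

```python
import re

# Assumption: the GPA is written like "GPA: 3.5" (colon optional, any case),
# as in the sample output. This is not a general resume parser.
GPA_PATTERN = re.compile(r'GPA\s*:?\s*(\d+(?:\.\d+)?)', re.IGNORECASE)


def extract_gpa(lines):
    """Return the first GPA found in the lines as a float, or None."""
    for line in lines:
        match = GPA_PATTERN.search(line)
        if match:
            return float(match.group(1))
    return None


def gpa_score(gpa):
    """Map a GPA to points. The thresholds are hypothetical."""
    if gpa is None:
        return 0
    if gpa >= 3.5:
        return 2
    if gpa >= 3.0:
        return 1
    return 0


# Lines copied from the cv1.pdf output
p_lines = ['Name: M. Abrar Hussain ', 'GPA: 3.5 ', 'Skills: Python, Laravel ', 'Experience: 3 yr ']
gpa = extract_gpa(p_lines)
print("GPA = %s, GPA score = %i" % (gpa, gpa_score(gpa)))
# prints: GPA = 3.5, GPA score = 2
```

In the loop, `extract_gpa(p_lines)` could be called right after `p_lines` is built, and `gpa_score(gpa)` added to `score` next to the keyword points.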
