
Natural Language Processing with Python, Advanced (Part 2)

程序员文章站 2022-07-13 10:08:56

Basic Python string operations

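namesList = ['Tuffy', 'Ali', 'Nysha', 'Tim']
sentence = 'My dog sleeps on sofa'

# Join the list items into a single string, separated by ';'
names = ';'.join(namesList)
print(type(names), ':', names)
# Split the sentence into a list of words on single spaces
wordList = sentence.split(' ')
print(type(wordList), ':', wordList)
# '+' concatenates strings; '*' repeats a string
additionExample = 'ganesha' + 'ganesha' + 'ganesha'
multiplicationExample = 'ganesha' * 2
print('Text Additions :', additionExample)
print('Text Multiplication :', multiplicationExample)
# Index single characters; negative indices count from the end.
# The variable is called text so it does not shadow the builtin str.
text = 'Python NLTK'
print(text[1])   # y
print(text[-3])  # L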
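
As a small follow-up to the operations above, the sketch below checks that split undoes join as long as no item contains the separator, and shows slicing alongside single-character indexing. The variable names here are illustrative and do not come from the original article.

```python
# Illustrative names; not part of the original article
names_list = ['Tuffy', 'Ali', 'Nysha', 'Tim']

# join and split are inverses as long as no item contains the separator
joined = ';'.join(names_list)
round_trip = joined.split(';')

text = 'Python NLTK'
first_word = text[:6]   # slice from the start up to (not including) index 6
last_word = text[-4:]   # slice covering the last four characters

print(joined)
print(round_trip == names_list)
print(first_word, last_word)
```

Slices return a new string and never raise IndexError when the bounds run past the end, which makes them a safer choice than single-character indexing on text of unknown length.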

Reading a PDF with Python

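# Uses the legacy PyPDF2 (< 3.0) API. Newer releases rename these calls
# to PdfReader, reader.pages and page.extract_text().
from PyPDF2 import PdfFileReader

def getTextPDF(pdfFileName, password=''):
    # Open the file in binary mode; the with-block closes it on exit
    with open(pdfFileName, 'rb') as pdf_file:
        # Create the reader
        read_pdf = PdfFileReader(pdf_file)
        # Decrypt only when a password was supplied
        if password != '':
            read_pdf.decrypt(password)
        text = []
        for i in range(read_pdf.getNumPages()):
            # Extract the text of each page
            text.append(read_pdf.getPage(i).extractText())
    return '\n'.join(text)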
pdfFile = 'sample-one-line.pdf'
pdfFi