Copying Word paragraphs with Python: find a heading in a Word file, then copy the entire paragraph to a new Word file using Python

I have the following situation:

I have several hundred Word files that contain company information. I would like to search these files for specific words to find specific paragraphs and copy just these paragraphs to new Word files. Basically I just need to reduce each of the original couple hundred documents to a more readable size.

The documents that I have are located in one directory and carry different names. In each of them I want to extract particular information that I need to define individually.

To go about this I started with the following code to first write all file names into a .csv file:

# list all transcript files and print names to .csv
import os
import csv

with open("C:\\Users\\Stef\\Desktop\\Files.csv", 'w') as f:
    writer = csv.writer(f)
    for path, dirs, files in os.walk("C:\\Users\\Stef\\Desktop\\Files"):
        for filename in files:
            writer.writerow([filename])

This works perfectly. Next I open Files.csv and fill in the second column with the keywords that I need to search for in each document.

See picture below for how the .csv file looks:

The couple hundred Word files I have are structured with different levels of headings. What I want to do now is search for specific headings using the keywords I manually defined in the .csv and then copy the content of the following passage to a new file. I uploaded an extract from a Word file: "Presentation" is a 'Heading 1', and "North America" and "China" are 'Heading 2'.

In this case I would like, for example, to search for the 'Heading 2' "North America" and then copy the text below it ("In total [...] diluted basis.") to a new Word file that has the same name as the old one, just with an added "_clean.docx".
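To make the goal concrete, here is a minimal sketch of that step. It assumes the python-docx package (imported as docx), that the headings really use the built-in 'Heading 1' / 'Heading 2' style names, and that the output should be named like "name_clean.docx". The helper names heading_level, extract_section and copy_section are hypothetical, made up for illustration only.

```python
import os


def heading_level(style_name):
    """Return N for a built-in 'Heading N' style name, otherwise None."""
    prefix, _, number = style_name.rpartition(" ")
    if prefix == "Heading" and number.isdigit():
        return int(number)
    return None


def extract_section(paragraphs, heading_text):
    """Collect the paragraphs that follow the heading `heading_text`.

    Collection stops at the next heading of the same or a higher level,
    so deeper sub-headings inside the section are kept.
    """
    section = []
    found_level = None
    for para in paragraphs:
        level = heading_level(para.style.name)
        if found_level is None:
            # Still searching for the heading we want.
            if level is not None and para.text.strip() == heading_text:
                found_level = level
        elif level is not None and level <= found_level:
            break  # reached the next section
        else:
            section.append(para)
    return section


def copy_section(src_path, heading_text):
    """Write the section under `heading_text` to '<name>_clean.docx'."""
    import docx  # third-party package python-docx (assumed installed)

    source = docx.Document(src_path)
    target = docx.Document()
    for para in extract_section(source.paragraphs, heading_text):
        # Copies the plain text only; run-level formatting is not carried over.
        target.add_paragraph(para.text)
    root, _ = os.path.splitext(src_path)
    target.save(root + "_clean.docx")
```

For the extract described above, copy_section(path, "North America") would be expected to keep the paragraphs between "North America" and "China". Matching here is by exact heading text; a keyword-contains check may fit the .csv keywords better.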

I started with my code as follows:

import os
import glob
import csv
import docx

os.chdir('C:\\Users\\Stef\\Desktop')

f = open('Files.csv')
csv_f = csv.reader(f)

file_name = []
matched_keyword = []

for row in csv_f:
    file_name.append(row[0])