python - How do I match each word of one text file against the words in another file, along with each word's value?
Problem description
Contents of ttt.txt:

president said would bill program loan farmers corn committee department agriculture usda house

Contents of sss.txt:

Topic 0th:
said 0.045193
would 0.028879
bill 0.011087
program 0.010718
loan 0.008395
farmers 0.008237
corn 0.008078
committee 0.007022
department 0.006811
agriculture 0.006653
usda 0.006547
house 0.006494
president
Topic 1th:
said 0.044315
shares 0.031928
stock 0.028001
company 0.023888
group 0.017063
offer 0.016408
share 0.016268
dlrs 0.016034
corp 0.015520
common 0.013463
president 0.000047

For each word in ttt.txt, how do I find the matching word in sss.txt under each of the two topics, together with its value?
Answer
Answer 1:

    # coding: utf8
    result = {}
    with open('ttt.txt') as f_t, open('sss.txt') as f_s:
        key_set = set(f_t.read().split())  # collect every word of ttt.txt into a set
        topic = ''
        for line in f_s:
            if line.startswith('Topic'):  # start a new dict for each Topic header
                topic = line.strip()
                result[topic] = {}
            else:
                line_split = line.split()
                if not line_split:  # skip blank lines
                    continue
                if len(line_split) < 2:
                    line_split.append('None')  # placeholder for a word with no value
                key, value = line_split
                if key in key_set:  # first column is a word from ttt.txt: keep its value
                    result[topic][key] = value
    print(result)
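Since the question asks for each word's value under both topics, it may be handier to group the result by word rather than by topic. The following is a minimal sketch of that variant, not part of the original answer. The function name `match_topic_values` is hypothetical, the values are converted to `float` on the assumption that the second column is always numeric, and the sample strings are a shortened excerpt of the data in the question so the example runs without any files.

```python
from collections import defaultdict


def match_topic_values(words_text, topics_text):
    """Return {word: {topic: value or None}} for each wanted word found under a topic.

    Hypothetical helper for illustration; assumes the second column is numeric.
    """
    wanted = set(words_text.split())
    matches = defaultdict(dict)
    topic = None
    for line in topics_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == 'Topic':
            topic = line.strip().rstrip(':')  # header line, e.g. "Topic 0th"
            continue
        if topic is None or parts[0] not in wanted:
            continue  # data before any header, or a word we are not looking for
        # A line holding only a word (no number) is recorded with value None.
        matches[parts[0]][topic] = float(parts[1]) if len(parts) > 1 else None
    return dict(matches)


# Shortened excerpt of the question's data.
words = "president said would bill"
topics = """Topic 0th:
said 0.045193
would 0.028879
president
Topic 1th:
said 0.044315
shares 0.031928
president 0.000047
"""

table = match_topic_values(words, topics)
for word in sorted(table):
    print(word, table[word])
```

With real files, the two arguments could be read with `open('ttt.txt').read()` and `open('sss.txt').read()`. Words from ttt.txt that appear under no topic are simply absent from the returned dict.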
