[This article is from 天外歸雲's blog on 部落格園 (cnblogs)]
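Introduction to the Rhyme Bot
Recently I saw people in a group chat talking about rhyming machines, and it brought back memories from many years ago.
On a whim, I wrote a rhyme bot. It can identify the rhyme of a word, compare the rhymes of two words, and group a list of words by rhyme.
Testing shows that polyphonic characters (characters with more than one reading) are not handled well yet: for example, 嘮 as used in 嘮嗑 versus 嘮叨. Cases like these are identified incorrectly. You are welcome to keep testing and report any problems to me.
Pinyin recognition is built on the pypinyin library; see its GitHub repository for usage details.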
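By default, pypinyin's pinyin() returns tone-marked syllables such as "tiān", so any rhyme check has to decide what to do with the tone mark. The following is a minimal sketch, using only the standard library, of separating the tone from the letters of one syllable. The helper name split_tone and the tone numbering (0 for a syllable with no mark) are assumptions made for illustration; they are not part of pypinyin or of the bot's code.

```python
import unicodedata

# Combining marks used for the four pinyin tones (after NFD decomposition).
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def split_tone(syllable):
    """Return (syllable without its tone mark, tone number).

    Hypothetical helper: tone number 0 means no tone mark was found.
    The diaeresis of "ü" is not a tone mark, so it is kept.
    """
    tone = 0
    kept = []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)
    # Recompose so that "u" + diaeresis becomes a single "ü" again.
    return unicodedata.normalize("NFC", "".join(kept)), tone


if __name__ == "__main__":
    print(split_tone("tiān"))  # ('tian', 1)
    print(split_tone("zǎo"))   # ('zao', 3)
    print(split_tone("de"))    # ('de', 0)
```

Whether two words with the same final but different tones should count as rhyming is a design choice; having the tone as a separate value makes either rule easy to express.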
Rhyme Bot Code
The rhyme bot code lives in a file named "punchliner.py". The code is as follows:
from pypinyin import pinyin

words = ["今天", "太躁", "艾福傑尼", "著迷", "太繞", "心間", "限", "盛宴",
         "榴蓮", "虧欠", "二百五", "腐乳", "火鍋底料", "MC大笑", "別跟我嘮",
         "我感冒", "好不好", "太早", "住口", "兄弟", "胸臆", "太辣", "太大",
         "太炸", "我手抖"]


def is_alphabet(uchar):
    """Return True if uchar is an ASCII letter (A-Z or a-z)."""
    return 'A' <= uchar <= 'Z' or 'a' <= uchar <= 'z'


def get_punchline(word):
    """Split the pinyin of the word's last character at its tone-marked vowel.

    Returns [prefix, toned vowel, suffix], e.g. "tiān" -> ["ti", "ā", "n"].
    A syllable without a tone mark yields an empty list.
    """
    last_character = word[-1]
    last_character_pinyin = pinyin(last_character)[0][0]
    punchline = []
    for the_char in last_character_pinyin:
        # The tone-marked vowel is the only non-ASCII letter in the syllable.
        if not is_alphabet(the_char):
            parts = last_character_pinyin.split(the_char)
            punchline.append(parts[0])
            punchline.append(the_char)
            punchline.append(parts[1])
    return punchline


def compare_punchline(word1, word2):
    """Return True if the last characters of word1 and word2 rhyme."""
    punchline1 = get_punchline(word1)
    punchline2 = get_punchline(word2)
    prefix1 = punchline1[0]
    prefix2 = punchline2[0]
    # Default the prefix's last letter to a non-empty placeholder
    prefix1_last_char = 'x'
    prefix2_last_char = 'x'
    if prefix1 != '':
        prefix1_last_char = prefix1[-1]
    if prefix2 != '':
        prefix2_last_char = prefix2[-1]
    # Prefix precondition: the words rhyme only if both prefixes end in "i"
    # or neither does
    all_i = prefix1_last_char == 'i' and prefix2_last_char == 'i'
    all_not_i = 'i' not in [prefix1_last_char, prefix2_last_char]
    if all_i or all_not_i:
        same_vowel = punchline1[1] == punchline2[1]
        same_suffix = punchline1[2] == punchline2[2]
        return same_vowel and same_suffix
    return False


def classify_punchline(words_list):
    """Print the words that rhyme with the first word, then recurse on the rest."""
    target = words_list[0]
    # Filter the list passed in, not the global "words"
    yayun_words = filter(lambda word: compare_punchline(target, word), words_list)
    yayun_words_list = list(set(yayun_words))
    left_words_list = list(set(words_list) - set(yayun_words_list))
    print(yayun_words_list)
    rule1 = left_words_list != words_list
    rule2 = len(left_words_list) > 0
    if rule1 and rule2:
        classify_punchline(left_words_list)


if __name__ == '__main__':
    # print(get_punchline("變"))
    # print(get_punchline("案"))
    # print(get_punchline("繞"))
    # print(compare_punchline("安", "翻"))
    # print(compare_punchline("變", "案"))
    # print(compare_punchline("房", "狼"))
    # print(get_punchline("嘮"))
    classify_punchline(words)
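Where:
1. The function classify_punchline checks the words in a word list and automatically groups the ones that rhyme;
2. The function get_punchline returns the rhyme of a word;
3. The function compare_punchline compares the rhymes of two words.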
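The grouping step can also be done in a single pass by reducing each syllable to a comparable key and collecting words in a dictionary, instead of recursing on the leftover words. The sketch below is an alternative illustration, not the code above: rhyme_key, group_by_rhyme and the hand-written LAST_SYLLABLE table are hypothetical names, and the pinyin strings are typed in by hand so the example runs without pypinyin. The key follows the same rule as compare_punchline (same toned vowel, same suffix, and agreement on whether the prefix ends in "i").

```python
from collections import defaultdict

# Tone-marked vowels that can appear in a pinyin syllable.
TONED_VOWELS = set("āáǎàēéěèīíǐìōóǒòūúǔùǖǘǚǜ")


def rhyme_key(syllable):
    """Return (prefix ends in 'i', toned vowel, suffix) for one syllable.

    Hypothetical helper. A syllable with no tone mark falls back to a key
    built from the whole syllable rather than raising an error.
    """
    for i, ch in enumerate(syllable):
        if ch in TONED_VOWELS:
            prefix, suffix = syllable[:i], syllable[i + 1:]
            return (prefix.endswith("i"), ch, suffix)
    return (False, "", syllable)


def group_by_rhyme(words, last_syllable):
    """Group words by rhyme key; groups keep the order of first appearance."""
    groups = defaultdict(list)
    for word in words:
        groups[rhyme_key(last_syllable[word])].append(word)
    return list(groups.values())


# Hand-written pinyin of each word's last character (stand-in for pypinyin).
LAST_SYLLABLE = {
    "今天": "tiān",
    "太躁": "zào",
    "心間": "jiān",
    "太繞": "rào",
    "太早": "zǎo",
}

if __name__ == "__main__":
    for group in group_by_rhyme(list(LAST_SYLLABLE), LAST_SYLLABLE):
        print(group)
```

Because the key is computed once per word, this avoids comparing every word against every group leader, and it keeps the input order inside each group, which the set-based version does not guarantee.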
I hope that one day, just as AlphaGo was invented, an AlphaRapper can be invented and sent to compete on The Rap of China (中國有嘻哈).
Output: