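Topic: 002.08 News Search with PySimpleGUI + News API
Created: 2019/12/15
Updated: None
Language: Python 3.7.2, PySimpleGUI 4.11.0, newsapi 0.1.1
System: Win10 Ver. 10.0.17763
002.08 News Search with PySimpleGUI + News API
I recently came across a package that gives access to more than 30,000 news sources. To make it easier to search for real-time news, I wrote a simple program that looks up short summaries of related news from the past month (the limit for free users). From there you can go to the original source to read the full article.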
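1. Features:
- Select a language. Currently only Arabic, Chinese, Dutch, English, French, German, Hebrew, Italian, Northern Sami, Norwegian, Portuguese, Russian, Spanish and Swedish are available.
- Select a start date and an end date.
- Enter keywords or phrases to search for in the article title and body.
- Advanced search is supported here:
  - Surround phrases with double quotes (") for an exact match.
  - Prepend words or phrases that must appear with a + symbol, e.g. +bitcoin
  - Prepend words that must not appear with a - symbol, e.g. -bitcoin
  - Alternatively, use the AND / OR / NOT keywords, and optionally group these with parentheses, e.g. crypto AND (ethereum OR litecoin) NOT bitcoin.
- Date: free users can only select dates within the last month.
- Speed: the larger the page_size used when loading web data, the slower the search. It is currently set to 100 (the maximum), so please be patient. You can change it to a smaller number, such as 20.
- URL: click the title of a news item to open its source URL.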
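The search operators and the one-month date limit listed above can be tried out with a small sketch. The helpers build_query and clamp_dates, and all of their parameter names, are made up for this illustration and are not part of newsapi. The one-month limit is approximated here as 30 days, which is an assumption.

```python
import datetime


def build_query(phrases=(), must=(), must_not=()):
    """Compose a search string using the quote, + and - operators.

    Hypothetical helper: the parameter names are made up for this sketch.
    """
    parts = ['"{}"'.format(p) for p in phrases]   # exact-match phrases
    parts += ['+' + w for w in must]               # words that must appear
    parts += ['-' + w for w in must_not]           # words that must not appear
    return ' '.join(parts)


def clamp_dates(start, stop, today, max_days=30):
    """Order the two dates and keep both within max_days before today.

    max_days=30 is an assumed stand-in for the one-month limit.
    """
    if stop < start:
        start, stop = stop, start
    earliest = today - datetime.timedelta(days=max_days)
    start = min(max(start, earliest), today)
    stop = min(max(stop, earliest), today)
    return start, stop


query = build_query(phrases=['electric car'], must=['tesla'],
                    must_not=['bitcoin'])
print(query)  # "electric car" +tesla -bitcoin

today = datetime.date(2019, 12, 15)
start, stop = clamp_dates(datetime.date(2019, 10, 1),
                          datetime.date(2019, 12, 10), today)
print(start, stop)  # 2019-11-15 2019-12-10
```

The resulting string would go into the q argument of a search, and the two dates into from_param and to after formatting them as YYYY-MM-DD.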
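2. Brief notes on the main packages, PySimpleGUI and newsapi
- PySimpleGUI part:
A window is basically created as follows (a template, not runnable as-is):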
    import PySimpleGUI as sg

    layout = [[FirstRowElement(..., key='key1'), ....],
              [SecondRowElement(..., key='key2'), ....],
              ....,
              [NthRowElement(..., key='keyN'), ....]]
    window = sg.Window('Title', layout=layout, ....)   # plus other arguments

    while True:
        event, values = window.read()
        if event is None:        # the window was closed
            break
        if event == 'key1':
            ...                  # do something
        if event == 'key2':
            ...                  # do something

    window.close()
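- Elements are basically similar to tkinter widgets. For ease of use they take only a few simple, essential arguments, so anything with special requirements is another matter.
- The window arrangement is described by a layout; some elements can contain a layout of their own.
- The key identifies an element when an event occurs (tkinter calls these widgets; the term element is used here mainly to avoid confusion).
- All events are read with window.read().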
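The read-and-dispatch pattern described above can be tried without opening a window. In the sketch below, a scripted iterator stands in for window.read(); the function run_event_loop, the handlers table and the scripted events are all made up for illustration and are not PySimpleGUI APIs.

```python
def run_event_loop(read, handlers):
    """Call read() until it reports a close event, dispatching on the key."""
    log = []
    while True:
        event, values = read()
        if event is None:               # the window was closed
            break
        handler = handlers.get(event)   # look up the element key
        if handler is not None:
            log.append(handler(values))
    return log


# Scripted events standing in for real user actions (made-up data).
script = iter([
    ('key1', {'key1': 'bitcoin'}),
    ('key2', {'key1': 'bitcoin'}),
    (None, None),
])

handlers = {
    'key1': lambda values: 'search for ' + values['key1'],
    'key2': lambda values: 'clear',
}

print(run_event_loop(lambda: next(script), handlers))
# ['search for bitcoin', 'clear']
```

A dictionary of handlers is only one way to organise the dispatch; the chain of if statements in the template does the same job and is easier to read for a handful of keys.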
- newsapi part:
    from newsapi import NewsApiClient

    # Create the client (replace the placeholder with your own API key)
    newsapi = NewsApiClient(api_key='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')

    # Top headlines
    top_headlines = newsapi.get_top_headlines(q='bitcoin',
                                              sources='bbc-news,the-verge',
                                              category='business',
                                              language='en',
                                              country='us')

    # Everything
    all_articles = newsapi.get_everything(q='bitcoin',
                                          sources='bbc-news,the-verge',
                                          domains='bbc.co.uk,techcrunch.com',
                                          from_param='2017-12-01',
                                          to='2017-12-12',
                                          language='en',
                                          sort_by='relevancy',
                                          page=2)

    # Sources
    sources = newsapi.get_sources()
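- Create an instance of the client class NewsApiClient().
- Use its only three methods: get_top_headlines(), get_everything() and get_sources().
  - get_top_headlines(): provides live top headlines and breaking news.
  - get_everything(): searches millions of articles from more than 30,000 large and small news sources and blogs.
  - get_sources(): can be used to keep track of the available publishers, and its result can be passed straight on to your users.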
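As the program in section 4 shows, get_everything() returns a dictionary with a 'status' field and a list of 'articles'. The sketch below walks such a result without contacting the server. The sample payload is made-up placeholder data, limited to the fields that program reads, and summarize is a hypothetical helper.

```python
# Made-up sample in the shape the program below expects from get_everything().
sample = {
    'status': 'ok',
    'articles': [
        {'source': {'id': None, 'name': 'Example News'},
         'title': 'First headline',
         'description': None,
         'url': 'https://example.com/1',
         'urlToImage': None},
        {'source': {'id': 'example', 'name': 'Example Blog'},
         'title': 'Second headline',
         'description': 'A short summary.',
         'url': 'https://example.com/2',
         'urlToImage': 'https://example.com/2.png'},
    ],
}


def summarize(result):
    """Return (source name, title, url) tuples, or [] if status is not 'ok'."""
    if result.get('status') != 'ok':
        return []
    rows = []
    for article in result.get('articles', []):
        name = (article.get('source') or {}).get('name') or ''
        title = article.get('title') or ''      # missing values become ''
        rows.append((name, title, article.get('url') or ''))
    return rows


for row in summarize(sample):
    print(row)
# ('Example News', 'First headline', 'https://example.com/1')
# ('Example Blog', 'Second headline', 'https://example.com/2')
```

Replacing None with an empty string mirrors what the convert() method in section 4 does before the text is drawn.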
3. Output screen
4. Code
Note: the code contains the line my_api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'. This is the API key, which you can obtain by registering on the newsapi website.
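If you would rather not paste your key into the source file, one option is to read it from an environment variable. This is only a sketch of that idea: the variable name NEWSAPI_KEY and the helper load_api_key are assumptions made for this example, not something newsapi defines.

```python
import os

PLACEHOLDER = 'x' * 32   # same placeholder as in the program below


def load_api_key(env_name='NEWSAPI_KEY', default=PLACEHOLDER):
    """Return the key stored in the environment, or the placeholder.

    NEWSAPI_KEY is an assumed variable name chosen for this sketch.
    """
    return os.environ.get(env_name, default)


my_api_key = load_api_key()
if my_api_key == PLACEHOLDER:
    print('No API key configured; set NEWSAPI_KEY first.')
```

On Windows the variable can be set for the current console session with `set NEWSAPI_KEY=...` before starting the program.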
#!/usr/bin/python
'''
Search worldwide news with PySimpleGUI code & news API
Get breaking news headlines, and search for articles from over 30,000 news
sources and blogs with news API. News API is a simple and easy-to-use API
that returns JSON metadata for headlines and articles live all over the web
right now.
'''
import PySimpleGUI as sg
from tkinter import font as FONT
from newsapi import NewsApiClient
from PIL import Image
from io import BytesIO
import requests
import _thread
import webbrowser
import datetime
import dateutil.relativedelta
import base64
import ctypes
import os
class News():
    '''
    News class: fetch news through newsapi and load photos from the source
    web sites
    '''
    def __init__(self, text):
        self.date = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
        self.coding = 'utf-8'
        self.text = text
        self.stop = False
        self.raw_data = self.read()
        self.length = len(self.raw_data)
        self.data = self.convert()
        self.width = [0 for i in range(self.length)]
        self.height = [0 for i in range(self.length)]
        self.base64 = [0 for i in range(self.length)]
        self.photo = []
        self.where = 0

    def convert(self):
        # Convert raw data structure to my data structure
        if self.length == 0: return []
        result = [{} for i in range(self.length)]
        for i in range(self.length):
            for key, value in self.raw_data[i].items():
                if key == 'source': value = self.raw_data[i]['source']['name']
                if value is None: value = ''
                # Strip bold tags and newlines, normalize curly quotes
                new = [('<b>', ''), ('</b>', ''), ('\n', ' '), ('’',"'"),
                       ('”',"'"), ('“',"'")]
                mrep = lambda s, d: s if not d else mrep(s.replace(*d.pop()), d)
                value = mrep(value, new)
                result[i][key] = value
        return result

    def read(self):
        # Load news from the News API web site
        try:
            newsapi = NewsApiClient(api_key=my_api_key)
            result = newsapi.get_everything(
                q=self.text,
                language=Language[default],
                page_size=page_size,
                from_param=start, to=stop)
        except Exception:
            sg.popup('Server link failed !')
            return []
        if result['status'] != 'ok':
            sg.popup('Server link failed !')
            return []
        return result['articles']

    def update(self):
        # Update photos, starting one thread per article
        if self.length == 0: return
        for i in range(self.length):
            # self.image(i)     # Slow, but safe
            _thread.start_new_thread(self.image, (i,))  # Quick, but bug

    def image(self, i):
        # Draw image on window canvas
        if self.stop:
            return
        if not self.load(i):    # load photo from web site by URL
            ids = draw.DrawText('X',
                (gap*2+int(im_w/2), canv_h-gap*2-int(im_h/2)-i*(gap*3.5+im_h)),
                color='white', font=font)
            return
        offset = i*(gap*3.5+im_h)
        ids = draw.DrawImage(data=self.base64[i],
            location=(gap*2+(im_w-self.width[i])/2,
                canv_h-gap*2.5-(im_h-self.height[i])/2-offset-self.where))
        self.photo.append(ids)
        return

    def load(self, i):
        # Load, resize and convert to base64
        url = self.data[i]['urlToImage']
        if url == '': return False
        try:
            response = requests.get(url)
            if response.status_code != requests.codes.ok:
                return False
            im = Image.open(BytesIO(response.content))
        except Exception:
            print('Failed: request/status code/open', url)
            return False
        if im.width==0 or im.height==0:
            return False
        im = im.convert(mode='RGBA')
        # Fit the photo into an im_w x im_h box, keeping its aspect ratio
        if im.width*ratio >= im.height:
            self.width[i], self.height[i] = im_w, int(im.height*im_w/im.width)
        else:
            self.width[i], self.height[i] = int(im.width*im_h/im.height), im_h
        im = im.resize((self.width[i], self.height[i]), resample=Image.LANCZOS)
        buffered = BytesIO()
        im.save(buffered, format="PNG")
        self.base64[i] = base64.b64encode(buffered.getvalue())
        return True

def wheel(event):
    # Mouse wheel event handler
    delta = int(event.delta/2)
    limit = -total_length+canv_h
    if delta < 0:
        if news.where+delta <= limit:
            delta = limit - news.where
            news.where = limit
        else:
            news.where += delta
    elif delta > 0:
        if news.where+delta >= 0:
            delta = -news.where
            news.where = 0
        else:
            news.where += delta
    draw.Move(0, -delta)

def split(txt):
    # Split text into a list of tokens: single spaces, runs of characters
    # found in ASCII, and individual characters outside it
    txt = txt.strip()
    if txt == '':
        return []
    result = []
    string = ''
    for i in range(len(txt)):
        if txt[i] in [' ', '\n','\r']:
            if string != '':
                result.append(string)
            result.append(' ')
            string = ''
        elif txt[i] in ASCII:
            string += txt[i]
        else:
            if string != '':
                result.append(string)
            result.append(txt[i])
            string = ''
    if string != '':
        result.append(string)
    return result

def wrap(txt, dist, lines_limit):
    # Wrap a string by inserting '\n' so that no line exceeds dist pixels,
    # keeping at most lines_limit lines
    if txt == '':
        return ''
    tmp = split(txt)
    old_string = ''
    string = ''
    result = ''
    length = len(tmp)
    len_1 = length - 1
    lines = 0
    for i in range(length):
        string += tmp[i]
        if s.measure(string) > dist:
            result += old_string + '\n'
            lines += 1
            if tmp[i] == ' ':
                string = old_string = ''
            else:
                string = old_string = tmp[i]
        else:
            old_string = string
        if lines == lines_limit:
            old_string = ''
            break
    if old_string != '':
        result += old_string
        lines += 1
    return result

def Layout():
    # Window main Layout
    layout = [[sg.Text('Language', font=font, pad=((40,0),0)),
               sg.Combo(values=language, default_value=default, size=(20,1),
                        enable_events=True, key='Combo', readonly=True,
                        font=font),
               sg.CalendarButton(start, size=(12,1), target='date1',
                                 key='date1', format=date_fmt, font=font),
               sg.CalendarButton(stop, size=(12,1), target='date2',
                                 key='date2', format=date_fmt, font=font),
               sg.Text('Key Words', font=font, pad=((5,0),0)),
               sg.InputText(size=(50,1), font = font, pad=((10,0),0),
                            do_not_clear=True, focus=True)]]
    return layout

def update_window():
    # Build a new window for the results of a new search
    global draw, total_length, s
    s = FONT.Font(family='Segoe', size=16)
    if news.length == 0:
        sg.popup('No news found or server failed')
        return None
    total_length = (3.5*gap+im_h)*news.length+gap
    layout = Layout() + [[sg.Graph(canvas_size=(canv_w, canv_h), key='Graph',
        graph_bottom_left=(0,0), graph_top_right=(win_w, win_h),
        enable_events=True)]]
    window = sg.Window('News Center', layout=layout, finalize=True,
        return_keyboard_events=True)
    draw = window['Graph']
    for i in range(news.length):
        # Each news item: wrap the title by title_w
        title = wrap(str(i+1)+'. '+news.data[i]['title'], title_w, title_h)
        # Wrap description by desc_w
        if news.data[i]['description'] == '':
            desc = 'No description...'
        else:
            desc = wrap(news.data[i]['description'], desc_w, desc_h)
        offset = i*(gap*3.5+im_h)
        draw.DrawRectangle((gap, canv_h-gap-offset),
            (canv_w-gap, canv_h-gap*3.5-im_h-offset), line_color='grey',
            line_width=1)
        draw.DrawRectangle((gap*2, canv_h-offset-gap+16),
            (canv_w-gap*2, canv_h-offset-gap-16), line_color='green',
            fill_color='green')
        draw.DrawText(title, (gap*2+12, canv_h-int(gap/2)-offset),
            color='white', font=font, text_location='n'+'w')
        draw.DrawText(desc, (gap*3+im_w, canv_h-gap*2.5-offset),
            color='white', font=font, text_location='n'+'w')
    window['Graph'].Widget.bind('<MouseWheel>', wheel)
    return window

ctypes.windll.user32.SetProcessDPIAware() # Set unit of GUI to pixels
# Usable option of Language for free user
Language = {'Arabic':'ar', 'Chinese':'zh', 'Dutch':'nl', 'English':'en',
'French':'fr', 'German':'de', 'Hebrew':'he', 'Italian':'it',
'Northern Sami':'se', 'Norwegian':'no', 'Portuguese':'pt',
'Russian':'ru', 'Spanish':'es', 'Swedish':'sv'}
language = list(Language.keys())
language.sort()
ASCII = [chr(i) for i in range(256)]
font = 'Segoe 16'
pad = 20
default = 'English'
date_fmt = '%Y-%m-%d'
now = datetime.datetime.now()
stop = now.strftime(date_fmt)
start = (now + dateutil.relativedelta.relativedelta(months=-1))
start = start.strftime(date_fmt)
month = start
page_size = 100 # 100 is the maximum; the larger the page_size, the slower
win_w = 1620
win_h = 720
im_w = 326
im_h = 145
ratio = im_h/im_w
canv_w = win_w
canv_h = win_h
gap = 25
title_w = canv_w - 4*gap - 12
title_h = 1
desc_w = canv_w - 5*gap - im_w
desc_h = 5
# You can get your API-Key on https://newsapi.org/register
my_api_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
help = '''
Keywords or phrases to search for in the article title and body.
Advanced search is supported here:
► Surround phrases with quotes (") for exact match.
► Prepend words or phrases that must appear with a + symbol. Eg: +bitcoin
► Prepend words that must not appear with a - symbol. Eg: -bitcoin
► Alternatively you can use the AND / OR / NOT keywords,
and optionally group these with parentheses.
Eg: crypto AND (ethereum OR litecoin) NOT bitcoin.
Date: Free users can only select dates within the last month.
Speed: The larger the page_size, the slower the web data loads.
It is now set to 100 (Max), so please wait a moment.
You can change it to a smaller number, like 20.
URL: Click on the title of a news item to browse its source URL.
'''
sg.change_look_and_feel('DarkBrown2')
layout = Layout() + [[sg.Graph(canvas_size=(canv_w, canv_h), key='Graph',
graph_bottom_left=(0,0), graph_top_right=(win_w, win_h))]]
window = sg.Window('News Center', layout=layout, finalize=True,
return_keyboard_events=True)
draw = window['Graph'].DrawText(help, (canv_w/2, canv_h/2),
color='white', font=font)
while True:
    event, values = window.read()
    # Window closed
    if event is None:
        break
    # Search starts when the Enter key is pressed
    if event == '\r':
        if len(values[0])!=0:
            # Update date information, free users are limited to 1 month
            new_start = window['date1'].GetText()
            new_stop = window['date2'].GetText()
            start = new_start if new_start >= month else start
            stop = new_stop if new_stop >= month else stop
            if stop < start:
                start, stop = stop, start
            layout1 = []
            news = News(values[0])
            news.stop = True
            window1 = update_window()
            if window1 is not None:
                window.close()
                window = window1
                news.stop=False
                news.update()
    if event=='Graph':
        # News title clicked, open the link in the web browser
        dist = (canv_h-values['Graph'][1]-news.where)
        off = dist % (3.5*gap+im_h) - gap
        index = int(dist / (3.5*gap+im_h))
        if ((-16<=off<=16) and (2*gap<=values['Graph'][0]<=canv_w-2*gap)
                and (index < news.length)):
            webbrowser.open(news.data[index]['url'])
    if event == 'Combo':
        # Set default value to selection
        default = values['Combo']
window.close()
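The comment "Quick, but bug" in News.update() marks the one-thread-per-article approach as unreliable. A possible cause, and this is an assumption rather than something verified against this program, is that the drawing calls run in worker threads while tkinter generally expects widget calls from the main thread. The sketch below shows a different arrangement using the standard library's concurrent.futures: workers only fetch, and the results are consumed in the calling thread. fake_load and load_all are made-up names, and fake_load stands in for the real image download.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_load(i):
    """Stand-in for a slow image download; returns (index, payload)."""
    return i, 'image-{}'.format(i)


def load_all(count, loader, max_workers=8):
    """Run loader(i) in worker threads; collect results in the caller's thread."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(loader, i) for i in range(count)]
        for future in as_completed(futures):
            index, payload = future.result()
            results[index] = payload    # a GUI program would draw here
    return results


print(sorted(load_all(3, fake_load).items()))
# [(0, 'image-0'), (1, 'image-1'), (2, 'image-2')]
```

In a real GUI, waiting on as_completed() like this would block the event loop until all downloads finish, so the results would more likely be handed over through a queue that the main loop polls between window.read() calls.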