被動資訊蒐集 - Python安全攻防

Nu1L發表於2021-04-28

概述:

被凍資訊蒐集主要通過搜尋引擎或者社交等方式對目標資產資訊進行提取,通常包括IP查詢,Whois查詢,子域名蒐集等。進行被動資訊蒐集時不與目標產生互動,可以在不接觸到目標系統的情況下挖掘目標資訊。

主要方法:DNS解析,子域名挖掘,郵件爬取等。

DNS解析:

1、概述:

DNS(Domain Name System,域名系統)是一種分散式網路目錄服務,主要用於域名與IP地址的相互轉換,能夠使使用者更方便地訪問網際網路,而不用去記住一長串數字(能夠被機器直接讀取的IP)。

2、IP查詢:

IP查詢是通過當前所獲取的URL去查詢對應IP地址的過程。可以利用Socket庫函式中的gethostbyname()獲取域名對應的IP值。

程式碼:

import socket

ip = socket.gethostbyname('www.baidu.com')
print(ip)

返回:

39.156.66.14

3、Whois查詢:

Whois是用來查詢域名的IP以及所有者資訊的傳輸協議。Whois相當於一個資料庫,用來查詢域名是否已經被註冊,以及註冊域名的詳細資訊(如域名所有人,域名註冊商等)。

Python中的python-whois模組可用於Whois查詢。

程式碼:

from whois import whois

data = whois('www.baidu.com')
print(data)

返回:

E:\python\python.exe "H:/code/Python Security/Day01/Whois查詢.py"
{
  "domain_name": [
    "BAIDU.COM",
    "baidu.com"
  ],
  "registrar": "MarkMonitor, Inc.",
  "whois_server": "whois.markmonitor.com",
  "referral_url": null,
  "updated_date": [
    "2020-12-09 04:04:41",
    "2021-04-07 12:52:21"
  ],
  "creation_date": [
    "1999-10-11 11:05:17",
    "1999-10-11 04:05:17"
  ],
  "expiration_date": [
    "2026-10-11 11:05:17",
    "2026-10-11 00:00:00"
  ],
  "name_servers": [
    "NS1.BAIDU.COM",
    "NS2.BAIDU.COM",
    "NS3.BAIDU.COM",
    "NS4.BAIDU.COM",
    "NS7.BAIDU.COM",
    "ns3.baidu.com",
    "ns2.baidu.com",
    "ns7.baidu.com",
    "ns1.baidu.com",
    "ns4.baidu.com"
  ],
  "status": [
    "clientDeleteProhibited https://icann.org/epp#clientDeleteProhibited",
    "clientTransferProhibited https://icann.org/epp#clientTransferProhibited",
    "clientUpdateProhibited https://icann.org/epp#clientUpdateProhibited",
    "serverDeleteProhibited https://icann.org/epp#serverDeleteProhibited",
    "serverTransferProhibited https://icann.org/epp#serverTransferProhibited",
    "serverUpdateProhibited https://icann.org/epp#serverUpdateProhibited",
    "clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)",
    "clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)",
    "clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)",
    "serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)",
    "serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)",
    "serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)"
  ],
  "emails": [
    "abusecomplaints@markmonitor.com",
    "whoisrequest@markmonitor.com"
  ],
  "dnssec": "unsigned",
  "name": null,
  "org": "Beijing Baidu Netcom Science Technology Co., Ltd.",
  "address": null,
  "city": null,
  "state": "Beijing",
  "zipcode": null,
  "country": "CN"
}

Process finished with exit code 0

 

子域名挖掘:

1、概述:

域名可以分為頂級域名,一級域名,二級域名等。

子域名(subdomain)是頂級域名(一級域名或父域名)的下一級。

在測試過程中,測試目標主站時如果未發現任何相關漏洞,此時通常會考慮挖掘目標系統的子域名。

子域名挖掘方法有多種,例如,搜尋引擎,子域名破解,字典查詢等。

2、利用Python編寫一個簡單的子域名挖掘工具:

(以https://cn.bing.com/為例)

程式碼:

# coding=gbk
import requests
from bs4 import BeautifulSoup
from urllib.parse import urlparse
import sys


def Bing_Search(site, pages):
    Subdomain = []
    # 以列表的形式儲存子域名
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'Referer': 'https://cn.bing.com/',
        'Cookie': 'MUID=37FA745F1005602C21A27BB3117A61A3; SRCHD=AF=NOFORM; SRCHUID=V=2&GUID=DA7BDD699AFB4AEB8C68A0B4741EFA74&dmnchg=1; MUIDB=37FA745F1005602C21A27BB3117A61A3; ULC=P=9FD9|1:1&H=9FD9|1:1&T=9FD9|1:1; PPLState=1; ANON=A=CEC39B849DEE39838493AF96FFFFFFFF&E=1943&W=1; NAP=V=1.9&E=18e9&C=B8-HXGvKTE_2lQJ0I3OvbJcIE8caEa9H4f3XNrd3z07nnV3pAxmVJQ&W=1; _tarLang=default=en; _TTSS_IN=hist=WyJ6aC1IYW5zIiwiYXV0by1kZXRlY3QiXQ==; _TTSS_OUT=hist=WyJlbiJd; ABDEF=V=13&ABDV=13&MRB=1618913572156&MRNB=0; KievRPSSecAuth=FABSARRaTOJILtFsMkpLVWSG6AN6C/svRwNmAAAEgAAACPyKw8I/CYhDEAFiUHPfZQSWnp%2BMm43NyhmcUtEqcGeHpvygEOz6CPQIUrTCcE3VESTgWkhXpYVdYAKRL5u5EH0y3%2BmSTi5KxbOq5zlLxOf61W19jGuTQGjb3TZhsv5Wb58a2I8NBTwIh/cFFvuyqDM11s7xnw/ZZoqc9tNuD8ZG9Hi29RgIeOdoSL/Kzz5Lwb/cfSW6GbawOVtMcToRJr20K0C0zGzLhxA7gYH9CxajTo7w5kRx2/b/QjalnzUh7lvZCNrF5naagj10xHhZyHItlNtjNe3yqqLyLZmgNrzT8o7QWfpJWHqAak4AFt3nY9R0NGLHM6UxPC8ph9hEaYbWtIsY7JNvVYFwbDk6o4oqu33kHeyqW/JTVhQACnpn2v74dZzvk4xRp%2BpcQIoRIzI%3D; _U=1ll1JNraa8gnrWOg3NTDw_PUniDnXYIikDzB-R_hVgutXRRVFcrnaPKxVBXA1w-dBZJsJJNfk6vGHSqJtUsLXvZswsd5A1xFvQ_V_nUInstIfDUs7q7FyY2DmvDRlfMIqbgdt-KEqazoz-r_TLWScg4_WDNFXRwg6Ga8k2cRyOTfGNkon7kVCJ7IoPDTAdqdP; WLID=kQRArdi2czxUqvURk62VUr88Lu/DLn6bFfcwTmB8EoKbi3UZYvhKiOCdmPbBTs0PQ3jO42l3O5qWZgTY4FNT8j837l8J9jp0NwVh2ytFKZ4=; _EDGE_S=SID=01830E382F4863360B291E1B2E6662C7; SRCHS=PC=ATMM; WLS=C=3d04cfe82d8de394&N=%e5%81%a5; SRCHUSR=DOB=20210319&T=1619277515000&TPC=1619267174000&POEX=W; SNRHOP=I=&TS=; _SS=PC=ATMM&SID=01830E382F4863360B291E1B2E6662C7&bIm=656; ipv6=hit=1619281118251&t=4; SRCHHPGUSR=SRCHLANGV2=zh-Hans&BRW=W&BRH=S&CW=1462&CH=320&DPR=1.25&UTC=480&DM=0&WTS=63754766339&HV=1619277524&BZA=0&TH=ThAb5&NEWWND=1&NRSLT=-1&LSL=0&SRCHLANG=&AS=1&NNT=1&HAP=0&VSRO=0'
    }
    for i in range(1, int(pages)+1):
        url = "https://cn.bing.com/search?q=site%3a" + site + "&go=Search&qs=ds&first=" + str((int(i)-1)*10) + "&FORM=PERE"
        html = requests.get(url, headers=headers)
        soup = BeautifulSoup(html.content, 'html.parser')
        job_bt = soup.findAll('h2')
        for i in job_bt:
            link = i.a.get('href')
            domain = str(urlparse(link).scheme + "://" + urlparse(link).netloc)
            if domain in Subdomain:
                pass
            else:
                Subdomain.append(domain)
                print(domain)


if __name__ == '__main__':
    if len(sys.argv) == 3:
        site = sys.argv[1]
        page = sys.argv[2]
    else:
        print("usge: %s baidu.com 10" % sys.argv[0])
        # 輸出幫助資訊
        sys.exit(-1)
    Subdomain = Bing_Search('www.baidu.com', 15)

返回:

 

郵件爬取:

1、概述:

在針對目標系統進行滲透的過程中,如果目標伺服器安全性很高,通過伺服器很難獲取目標許可權時,通常會採用社工的方式對目標服務進行進一步攻擊。

針對搜尋介面的相關郵件資訊進行爬取、處理等操作之後。利用獲得的郵箱賬號批量傳送釣魚郵件,誘騙、欺詐目標使用者或管理員進行賬號登入或點選執行,進而獲取目標系統的其許可權。

該郵件採集工具所用到的相關庫函式如下:

import sys
import getopt
import requests
from bs4 import BeautifulSoup
import re

2、過程:

在程式的起始部分,當執行過程中沒有發生異常時,則執行定義的start()函式。

通過sys.argv[ ] 實現外部指令的接收。其中,sys.argv[0] 代表程式碼本身的檔案路徑,sys.argv[1:] 表示從第一個命令列引數到輸入的最後一個命令列引數,儲存形式為list。

程式碼如下:

if __name__ == '__main__':
    # 定義異常
    try:
        start(sys.argv[1: ])
    except:
        print("interrupted by user, killing all threads ... ")

:編寫命令列引數處理功能。此處主要應用  getopt.getopt()函式處理命令列引數,該函式目前有短選項和長選項兩種格式。

短選項格式為“ - ”加上單個字母選項;

長選項格式為“ -- ”加上一個單詞選項。

opts為一個兩元組列表,每個元素形式為“(選項串,附加引數)”。當沒有附加引數時,則為空串。之後通過for語句迴圈輸出opts列表中的數值並賦值給自定義的變數。

程式碼如下:

def start(argv):
    url = ""
    pages = ""
    if len(sys.argv) < 2:
        print("-h 幫助資訊;\n")
        sys.exit()
    # 定義異常處理
    try:
        banner()
        opts, args = getopt.getopt(argv, "-u:-p:-h")
    except:
        print('Error an argument')
        sys.exit()
    for opt, arg in opts:
        if opt == "-u":
            url = arg
        elif opt == "-p":
            pages = arg
        elif opt == "-h":
            print(usage())
    launcher(url, pages)

:輸出幫助資訊,增加程式碼工具的可讀性和易用性。為了使輸出的資訊更加美觀簡潔,可以通過轉義字元設定輸出字型顏色,從而實現所需效果。

開頭部分包含三個引數:顯示方式,前景色,背景色。這三個引數是可選的,可以只寫其中一個引數。結尾可以省略,但為了書寫規範,建議以 “\033[0m” 結尾。

程式碼如下:

print('\033[0:30;41m 3cH0 - Nu1L \033[0m')
print('\033[0:30;42m 3cH0 - Nu1L \033[0m')
print('\033[0:30;43m 3cH0 - Nu1L \033[0m')
print('\033[0:30;44m 3cH0 - Nu1L \033[0m')
# banner資訊
def banner():
    print('\033[1:34m ################################ \033[0m\n')
    print('\033[1:34m 3cH0 - Nu1L \033[0m\n')
    print('\033[1:34m ################################ \033[0m\n')
# 使用規則
def usage():
    print('-h: --help 幫助;')
    print('-u: --url 域名;')
    print('-p --pages 頁數;')
    print('eg: python -u "www.baidu.com" -p 100' + '\n')
    sys.exit()

:確定搜尋郵件的關鍵字,並呼叫bing_search()和baidu_search()兩個函式,返回Bing與百度兩大搜尋引擎的查詢結果。由獲取到的結果進行列表合併,去重之後,迴圈輸出。

程式碼如下:

# 漏洞回撥函式
def launcher(url, pages):
    email_num = []
    key_words = ['email', 'mail', 'mailbox', '郵件', '郵箱', 'postbox']
    for page in range(1, int(pages)+1):
        for key_word in key_words:
            bing_emails = bing_search(url, page, key_word)
            baidu_emails = baidu_search(url, page, key_word)
            sum_emails = bing_emails + baidu_emails
            for email in sum_emails:
                if email in email_num:
                    pass
                else:
                    print(email)
                    with open('data.txt', 'a+')as f:
                        f.write(email + '\n')
                    email_num.append(email)

:用Bing搜尋引擎進行郵件爬取。Bing引擎具有反爬防護,會通過限定referer、cookie等資訊來確定是否網頁爬取操作。

可以通過指定referer與requeses.session()函式自動獲取cookie資訊,繞過Bing搜尋引擎的反爬防護。

程式碼如下:

# Bing_search
def bing_search(url, page, key_word):
    referer = "http://cn.bing.com/search?q=email+site%3abaidu.com&sp=-1&pq=emailsite%3abaidu.com&first=1&FORM=PERE1"
    conn = requests.session()
    bing_url = "http://cn.bing.com/search?q=" + key_word + "+site%3a" + url + "&qa=n&sp=-1&pq=" + key_word + "site%3a" + url +"&first=" + str((page-1)*10) + "&FORM=PERE1"  
    conn.get('http://cn.bing.com', headers=headers(referer))
    r = conn.get(bing_url, stream=True, headers=headers(referer), timeout=8)
    emails = search_email(r.text)
    return emails

:用百度搜尋引擎進行郵件爬取。百度搜尋引擎同樣設定了反爬防護,相對Bing來說,百度不僅對referer和cookie進行校驗,還同時在頁面中通過JavaScript語句進行動態請求連結,從而導致不能動態獲取頁面中的資訊。

可以通過對連結的提取,在進行request請求,從而繞過反爬設定。

程式碼如下:

# Baidu_search
def baidu_search(url, page, key_word):
    email_list = []
    emails = []
    referer = "https://www.baidu.com/s?wd=email+site%3Abaidu.com&pn=1"
    baidu_url = "https://www.baidu.com/s?wd=" + key_word + "+site%3A" + url + "&pn=" + str((page-1)*10)
    conn = requests.session()   
    conn.get(baidu_url, headers=headers(referer))
    r = conn.get(baidu_url, headers=headers(referer))
    soup = BeautifulSoup(r.text, 'lxml')
    tagh3 = soup.find_all('h3')
    for h3 in tagh3:
        href = h3.find('a').get('href')
        try:
            r = requests.get(href, headers=headers(referer))
            emails = search_email(r.text)   
        except Exception as e:
            pass
        for email in emails:
            email_list.append(email)
    return email_list

:通過正規表示式獲取郵箱號碼。此處也可以換成目標企業郵箱的正規表示式。

程式碼如下:

# search_email
def search_email(html):
    emails = re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]" + html, re.I)
    return emails
# headers(referer)
def headers(referer):
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Referer': referer
    }
    return headers

3、完整程式碼:

# coding=gbk
import sys
import getopt
import requests
from bs4 import BeautifulSoup
import re


# 主函式,傳入使用者輸入的引數
def start(argv):
    url = ""
    pages = ""
    if len(sys.argv) < 2:
        print("-h 幫助資訊;\n")
        sys.exit()
    # 定義異常處理
    try:
        banner()
        opts, args = getopt.getopt(argv, "-u:-p:-h")
    except:
        print('Error an argument')
        sys.exit()
    for opt, arg in opts:
        if opt == "-u":
            url = arg
        elif opt == "-p":
            pages = arg
        elif opt == "-h":
            print(usage())
    launcher(url, pages)


# banner資訊
def banner():
    print('\033[1:34m ################################ \033[0m\n')
    print('\033[1:34m 3cH0 - Nu1L \033[0m\n')
    print('\033[1:34m ################################ \033[0m\n')


# 使用規則
def usage():
    print('-h: --help 幫助;')
    print('-u: --url 域名;')
    print('-p --pages 頁數;')
    print('eg: python -u "www.baidu.com" -p 100' + '\n')
    sys.exit()


# 漏洞回撥函式
def launcher(url, pages):
    email_num = []
    key_words = ['email', 'mail', 'mailbox', '郵件', '郵箱', 'postbox']
    for page in range(1, int(pages)+1):
        for key_word in key_words:
            bing_emails = bing_search(url, page, key_word)
            baidu_emails = baidu_search(url, page, key_word)
            sum_emails = bing_emails + baidu_emails
            for email in sum_emails:
                if email in email_num:
                    pass
                else:
                    print(email)
                    with open('data.txt', 'a+')as f:
                        f.write(email + '\n')
                    email_num.append(email)


# Bing_search
def bing_search(url, page, key_word):
    referer = "http://cn.bing.com/search?q=email+site%3abaidu.com&sp=-1&pq=emailsite%3abaidu.com&first=1&FORM=PERE1"
    conn = requests.session()
    bing_url = "http://cn.bing.com/search?q=" + key_word + "+site%3a" + url + "&qa=n&sp=-1&pq=" + key_word + "site%3a" + url +"&first=" + str((page-1)*10) + "&FORM=PERE1"
    conn.get('http://cn.bing.com', headers=headers(referer))
    r = conn.get(bing_url, stream=True, headers=headers(referer), timeout=8)
    emails = search_email(r.text)
    return emails


# Baidu_search
def baidu_search(url, page, key_word):
    email_list = []
    emails = []
    referer = "https://www.baidu.com/s?wd=email+site%3Abaidu.com&pn=1"
    baidu_url = "https://www.baidu.com/s?wd=" + key_word + "+site%3A" + url + "&pn=" + str((page-1)*10)
    conn = requests.session()
    conn.get(baidu_url, headers=headers(referer))
    r = conn.get(baidu_url, headers=headers(referer))
    soup = BeautifulSoup(r.text, 'lxml')
    tagh3 = soup.find_all('h3')
    for h3 in tagh3:
        href = h3.find('a').get('href')
        try:
            r = requests.get(href, headers=headers(referer))
            emails = search_email(r.text)
        except Exception as e:
            pass
        for email in emails:
            email_list.append(email)
    return email_list


# search_email
def search_email(html):
    emails = re.findall(r"[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]" + html, re.I)
    return emails


# headers(referer)
def headers(referer):
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
        'Accept-Encoding': 'gzip, deflate, br',
        'Referer': referer
    }
    return headers


if __name__ == '__main__':
    # 定義異常
    try:
        start(sys.argv[1: ])
    except:
        print("interrupted by user, killing all threads ... ")

 

參考:

《Python安全攻防-滲透測試實戰指南》——By MS08067 Team

相關文章