Simulated Login and File Download with Python


It has been a long time since I last updated this blog, haha.

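Today the company handed me this requirement: fetch the previous day's order numbers from Taobao, use each order number to fetch the order details from another interface, and then display the result on a web page!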

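The technical points involved are:

  • Simulating the login
  • Simulating the download
  • Parsing the Excel file data stream
  • Reading the Excel file and pulling out the order numbers
  • And finally, calling the interface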

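I'll go through them one by one. When I first got the requirement it was all rather vague, since I had never done any of these things before. Once I calmed down and worked through it, though, it turned out not to be hard at all; in fact it is quite simple.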

Simulated login

I. Analyzing the request headers

The login address this time is https://huoche.alitrip.com/hello.htm

1. I logged in once by hand and looked at the request. It carries only three things: a hidden token, the username, and the password.

 

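It is clear at a glance: this is just a back-office page, so it is relatively simple, haha. The next step is to take care of the cookies and then log in with the token, username, and password.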

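A note here: with Python's requests module you don't have to manage cookies yourself. Create a session object and it stores the cookies the server sets and sends them back on later requests, so there is no need to try cookies one by one. That saves a lot of code and time.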
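To make the cookie handling concrete, here is a small offline sketch (written for Python 3, whereas the code in this post targets Python 2). It plants a cookie in a session by hand and checks that the session attaches it to the next request it prepares. The domain `example.com`, the cookie name `sid`, and its value are made up for illustration, and no network request is sent.

```python
import requests

# Hypothetical values for illustration only
session = requests.Session()
session.cookies.set('sid', 'abc123', domain='example.com', path='/')

# Build a request through the session without sending it
request = requests.Request('GET', 'https://example.com/index.htm')
prepared = session.prepare_request(request)

# The session has added the stored cookie to the outgoing headers
cookie_header = prepared.headers.get('Cookie')
print(cookie_header)
```

In real use the cookie would come from the server's response to the first `session.get(...)` rather than being set by hand.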

2. The code is as follows

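import base64
import requests


class TbTomas(object):
    def __init__(self):
        # Initialise the session; it stores and resends cookies for us
        self.session_obj = requests.session()

    def download_file(self,thomas_username,thomas_password,):
        hello_url = 'https://huoche.alitrip.com/hello.htm'
        # Fetch the login page
        hello_response = self.session_obj.get(hello_url)
        # Pull the hidden token out of the page with a regex
        # (re_search is the helper defined in the full source at the end)
        h_u_s = re_search('<input type="hidden" id="h_u_s" name="h_u_s" value="(.*?)">', hello_response.text)
        
        h_u_s = base64.b64encode(h_u_s)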
        headers = {
            'Accept': 'text/html, application/xhtml+xml, image/jxr, */*',
            'Referer': 'https://huoche.alitrip.com/hello.htm',
            'Accept-Language': 'zh-CN',

            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586',
            'Content-Type': 'application/x-www-form-urlencoded',

            'Accept-Encoding': 'gzip, deflate',
            'Host': 'huoche.alitrip.com',
            # Content-Length is left out: requests computes it from the request body
            'Connection': 'Keep-Alive',
            'Cache-Control': 'no-cache'
        }

        post_data = {
            'h_u_s': base64.b64encode(h_u_s),
            'h_u_n': thomas_username,
            'h_u_p': base64.b64encode(thomas_password)
        }
        index_url = 'https://huoche.alitrip.com/index.htm'
        index_response = self.session_obj.post(index_url, headers=headers, data=post_data)

Once the POST request is submitted, you can tell from the response whether the login succeeded. Simple, isn't it? Haha!
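The token step can be tried without touching the site. The sketch below is written for Python 3, where `base64.b64encode` needs bytes rather than a str, and runs the same regular expression over a made-up HTML snippet. The helper names (`extract_token`, `b64_text`, `build_login_payload`) and the sample values are my own, not part of the original code. Note that it encodes the token once, while the code above encodes it twice; which one the server expects is not something this post settles.

```python
import base64
import re

TOKEN_PATTERN = re.compile(
    r'<input type="hidden" id="h_u_s" name="h_u_s" value="(.*?)">'
)


def extract_token(html):
    """Return the hidden h_u_s value, or '' when the field is missing."""
    match = TOKEN_PATTERN.search(html)
    return match.group(1) if match else ''


def b64_text(text):
    """Base64-encode a str and return a str (bytes round trip for Python 3)."""
    return base64.b64encode(text.encode('utf-8')).decode('ascii')


def build_login_payload(html, username, password):
    """Build the three form fields the login request carries."""
    token = extract_token(html)
    if not token:
        # Fail early instead of posting an empty token
        raise ValueError('h_u_s token not found in login page')
    return {
        'h_u_s': b64_text(token),
        'h_u_n': username,
        'h_u_p': b64_text(password),
    }


# Made-up page fragment and credentials, for illustration only
sample_html = '<form><input type="hidden" id="h_u_s" name="h_u_s" value="tok123"></form>'
payload = build_login_payload(sample_html, 'alice', 'secret')
print(payload)
```

Raising when the token is missing makes a changed page layout show up immediately, rather than as a silent login failure.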

Downloading the data

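Downloading works the same way as logging in: you send a POST request to the site and get the Excel file back. Python has a module for this, xlrd, which makes working with Excel files very convenient.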

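1. The page asks for a date and downloads that day's data. The task from my manager is to download the previous day's data, so it takes a few lines of code to compute yesterday's date.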

The code is as follows:

today = datetime.datetime.now()
yesterday = today + datetime.timedelta(days=-1)
trade_date = yesterday.strftime('%Y-%m-%d')
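With date arithmetic it is worth checking month and year boundaries. `timedelta` handles them, as this small Python 3 sketch shows. The function name `previous_day` and the option of passing in the date are my own additions, there so the result can be checked against fixed dates.

```python
import datetime


def previous_day(today=None):
    """Return the day before `today` as 'YYYY-MM-DD' (defaults to the current date)."""
    if today is None:
        today = datetime.date.today()
    return (today - datetime.timedelta(days=1)).strftime('%Y-%m-%d')


print(previous_day(datetime.date(2024, 3, 1)))   # 2024-02-29 (leap year)
print(previous_day(datetime.date(2023, 1, 1)))   # 2022-12-31
```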

2. Look at the URL of the download request and the data it submits. One screenshot makes everything clear.

The screenshot shows the request URL, the request method, the request headers, and the returned data.

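3. To simulate the download request, just submit the date and it's done. Once the file is downloaded, the next step is to read it and pull out what we need.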

        post_data = {
            'orderExportDate': trade_date
        }
        sheet_content = ""
        for _ in xrange(3):
            try:
                # Fetch the Excel file as a byte stream
                download_response = self.session_obj.post(download_url, data=post_data)
                # Open the Excel file from the bytes in memory
                xls_content = xlrd.open_workbook(file_contents=download_response.content)
                sheet_content = xls_content.sheets()[0]
                break
            except Exception as e:
                continue
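One thing to watch in the loop above: if all three attempts fail, `sheet_content` is still the empty string, and the later `sheet_content.nrows` fails with an `AttributeError` that hides the real cause. Below is a minimal Python 3 sketch of a retry helper that re-raises the last error instead. `fetch_with_retries` and `flaky_download` are hypothetical names, and `flaky_download` merely stands in for the download and `xlrd.open_workbook` step.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call fetch() up to `attempts` times and return its first successful result.

    If every attempt raises, the last exception is re-raised so the caller
    sees the failure instead of an empty placeholder.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # broad on purpose, mirroring the original loop
            last_error = error
            if delay and attempt < attempts - 1:
                time.sleep(delay)
    raise last_error


# Stand-in for the real download: fails twice, then succeeds
calls = {'count': 0}


def flaky_download():
    calls['count'] += 1
    if calls['count'] < 3:
        raise IOError('temporary failure')
    return 'sheet'


result = fetch_with_retries(flaky_download)
print(result, calls['count'])
```

In the real script, `fetch` would be a small function that posts to the download URL and returns the first sheet of the workbook.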

4. This part is familiar: just like reading a file, loop over the rows one by one and append each order number to a list.

        order_item = []
        for line_num in range(sheet_content.nrows):
            line_item = sheet_content.row_values(line_num)
            if line_item[2]:
                order_item.append(line_item[2], )  # order number (order_no)
        # Drop the header row; the rest are the order numbers
        order_item = order_item[1:]
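The code above collects every non-empty value in the third column and then drops the first one, which works as long as the header cell in that column is not empty. Here is a sketch that skips the header by position instead and tolerates short rows. It uses plain lists in the shape that `row_values` returns, so it runs without an Excel file; the function name and the sample rows are made up.

```python
def extract_order_numbers(rows, column=2, header_rows=1):
    """Collect non-empty values from one column, skipping the header rows."""
    order_numbers = []
    for row in rows[header_rows:]:
        # Guard against short rows so a ragged sheet does not raise IndexError
        if len(row) > column and row[column]:
            order_numbers.append(row[column])
    return order_numbers


# Made-up rows in the shape returned by sheet.row_values(i)
sample_rows = [
    ['No.', 'Date', 'Order number'],
    [1, '2024-01-01', 'T1001'],
    [2, '2024-01-01', ''],
    [3, '2024-01-01', 'T1003'],
    [4],
]
print(extract_order_numbers(sample_rows))   # ['T1001', 'T1003']
```

With xlrd, the rows could be built as `[sheet.row_values(i) for i in range(sheet.nrows)]`.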

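With the order numbers in hand, the next step is fetching the order details. My manager told me a colleague had already written that code and I only needed to call the interface, so I won't show someone else's code here. The principle is simple: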

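use the requests module to send a GET request to the URL with the order number as a parameter, and the data comes back. As for the web page display, that requirement differs from company to company: store the data in a database and pull out whatever you need.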

That's it for this post. If you learned something, please give it a like, haha.

Finally, the full source is attached. I won't be telling you the username and password, haha.

#!/usr/bin/python
# coding:utf-8
import sys
import os
import django

reload(sys)
sys.setdefaultencoding('utf8')
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))  # add the directory containing manage.py to the module search path
os.environ['DJANGO_SETTINGS_MODULE'] = 'business.settings'  # point Django at the settings module
django.setup()  # initialise the Django environment

import requests
import re
import base64
import xlrd
import datetime
import time
import MySQLdb
from business import settings
from train.depends.platform import Platform
from train.models import TbTomasOrder,TbTomasEpay,TtTicketThomas,TbTomasLinkman
from train import utils
from train.status import OrderStatus
from django.db import IntegrityError


class TbTomas(object):
    def __init__(self):
        # Initialise the session; it stores and resends cookies for us
        self.session_obj = requests.session()

    def download_file(self,thomas_username,thomas_password,):
        hello_url = 'https://huoche.alitrip.com/hello.htm'
        # Fetch the login page
        hello_response = self.session_obj.get(hello_url)
        # Pull the hidden token out of the page with a regex
        h_u_s = re_search('<input type="hidden" id="h_u_s" name="h_u_s" value="(.*?)">', hello_response.text)

        h_u_s = base64.b64encode(h_u_s)
        headers = {
            'Accept': 'text/html, application/xhtml+xml, image/jxr, */*',
            'Referer': 'https://huoche.alitrip.com/hello.htm',
            'Accept-Language': 'zh-CN',

            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2486.0 Safari/537.36 Edge/13.10586',
            'Content-Type': 'application/x-www-form-urlencoded',

            'Accept-Encoding': 'gzip, deflate',
            'Host': 'huoche.alitrip.com',
            # Content-Length is left out: requests computes it from the request body
            'Connection': 'Keep-Alive',
            'Cache-Control': 'no-cache'
        }

        post_data = {
            'h_u_s': base64.b64encode(h_u_s),
            'h_u_n': thomas_username,
            'h_u_p': base64.b64encode(thomas_password)
        }
        index_url = 'https://huoche.alitrip.com/index.htm'
        index_response = self.session_obj.post(index_url, headers=headers, data=post_data)

        download_url = 'https://huoche.alitrip.com/orderlistexp.do'

        # Compute yesterday's date
        today = datetime.datetime.now()
        yesterday = today + datetime.timedelta(days=-1)
        trade_date = yesterday.strftime('%Y-%m-%d')

        post_data = {
            'orderExportDate': trade_date
        }
        sheet_content = ""
        for _ in xrange(3):
            try:
                # Fetch the Excel file as a byte stream
                download_response = self.session_obj.post(download_url, data=post_data)
                # Open the Excel file from the bytes in memory
                xls_content = xlrd.open_workbook(file_contents=download_response.content)
                sheet_content = xls_content.sheets()[0]
                break
            except Exception as e:
                continue
        order_item = []
        for line_num in range(sheet_content.nrows):
            line_item = sheet_content.row_values(line_num)
            if line_item[2]:
                order_item.append(line_item[2], )  # order number (order_no)
        # Drop the header row; the rest are the order numbers
        order_item = order_item[1:]

        # Fetch the order details for each order number
        self.create_order_info(order_item)

    def create_order_info(self, order_item):
        platform_obj = Platform()
        for order in order_item:
            order_info = platform_obj.get_order(order)
            # Insert into the order table
            self.insert_order(order_info,order)

            # Insert into the epay table
            self.insert_epay(order_info,order)

            # Insert into the ticket table
            self.insert_ticket(order_info,order)

            # Insert the contact person
            self.linkman(order_info,order)

            time.sleep(0.5)

    def insert_order(self,order_info,order_no):
        order_params = order_info.get('order')
        start_train_date = order_params.get('start_train_date')
        order_pull_time = datetime.datetime.now()
        submit_time = datetime.datetime.now()
        latest_issue_time = order_params.get('latest_issue_time')
        arrive_time = str(order_params.get('arrive_time')).split(" ")[1]
        passenger = order_params.get('passenger')[0]

        create_order = {}
        create_order['order_no'] = order_no
        create_order['ttp_order_id'] = order_params.get('ttp_order_id')
        create_order['order_type'] = order_params.get('order_type')
        create_order['segment_code'] = int(order_params.get('segment_code'))
        create_order['from_station_telecode'] = order_params.get('from_station_telecode')
        create_order['to_station_telecode'] = order_params.get('to_station_telecode')
        create_order['start_train_date'] = start_train_date
        create_order['station_train_code'] = order_params.get('station_train_code')
        create_order['latest_issue_time'] = latest_issue_time
        create_order['order_pull_time'] = order_pull_time
        create_order['from_station_name'] = order_params.get('from_station_name')
        create_order['to_station_name'] = order_params.get('to_station_name')
        create_order['arrive_time'] = arrive_time
        create_order['seat_type_code'] = passenger.get('seat_type_code')
        create_order['seat_type_name'] = passenger.get('seat_type_name')
        create_order['platform_seat_type_code'] = passenger.get('platform_seat_type_code')
        create_order['platform_seat_type_name'] = passenger.get('platform_seat_type_name')
        create_order['platform_ticket_price'] = passenger.get('platform_ticket_price', 0)
        create_order['ext_seat'] = MySQLdb.escape_string(passenger.get('ext_seat'))
        create_order['submit_time'] = submit_time
        create_order['status'] = OrderStatus.STATUS_NEW
        create_order['platform_ticket_price_all'] = order_params.get('ticket_price_all')
        create_order['created'] = utils.now()
        create_order['updated'] = utils.now()
        try:
            TbTomasOrder.objects.create(**create_order)
        except IntegrityError:
            pass

    def insert_epay(self,order_info,order_no):
        # Payment information
        order_params = order_info.get('order')
        passenger = order_params.get('passenger')
        alipayInfos = order_params.get('tb_extend_params').get('alipayInfos')

        for pa in passenger:
            create_eapy = {}
            create_eapy['order_no'] = order_no
            create_eapy['apply_id'] = pa.get('apply_id')
            create_eapy['alipay_trade_no'] = alipayInfos[0].get('alipayTradeNO')
            create_eapy['pay_account'] = alipayInfos[0].get('alipayNO')
            create_eapy['status'] = OrderStatus.STATUS_NEW
            create_eapy['created'] = utils.now()
            create_eapy['updated'] = utils.now()
            try:
                TbTomasEpay.objects.create(**create_eapy)
            except IntegrityError:
                pass

    def insert_ticket(self,order_info,order_no):
        order_params = order_info.get('order')
        segment_code = order_params.get('segment_code')
        sequence_no = order_params.get('sequence_no')
        station_train_code = order_params.get('station_train_code')

        passenger = order_params.get('passenger')
        for pa in passenger:
            pa['order_no'] = order_no
            pa['ticket_no'] = ''
            pa['segment_code'] = segment_code
            pa['sequence_no'] = sequence_no
            pa['station_train_code'] = station_train_code
            pa['from_station_telecode'] = order_params.get('from_station_telecode')
            pa['to_station_telecode'] = order_params.get('to_station_telecode')
            pa['from_station_name'] = order_params.get('from_station_name')
            pa['to_station_name'] = order_params.get('to_station_name')
            pa['start_train_date'] = order_params.get('start_train_date')
            pa['status'] = OrderStatus.STATUS_NEW
            pa['ext_seat'] = MySQLdb.escape_string(pa['ext_seat'])
            if not pa['sequence_no']:
                pa['sequence_no'] = ''
            pa['platform_passenger_id'] = int(pa['platform_passenger_id']) if pa['platform_passenger_id'] else 0
            if not pa['latest_resign_time']:
                del pa['latest_resign_time']

            # Set the status for paper tickets
            if order_params.get('order_login_type') == 'paper':
                pa['status'] = OrderStatus.STATUS_PAPER_WAIT_TICKET
            if 'insurance_number' in pa:
                del pa['insurance_number']
            if 'insurance_price' in pa:
                del pa['insurance_price']
            if 'student_info' in pa:
                del pa['student_info']
            try:
                TtTicketThomas.objects.create(**pa)
            except IntegrityError:
                pass

    def linkman(self,order_info,order_no):
        order_params = order_info.get('order')
        linkman = order_params.get('linkman')
        linkman['order_no'] = order_no
        linkman['created'] = datetime.datetime.now()
        linkman['updated'] = datetime.datetime.now()
        linkman['mailing'] = ""
        if "receiver_name" in linkman:
            del linkman['receiver_name']
        try:
            TbTomasLinkman.objects.create(**linkman)
        except IntegrityError:
            pass

    def run(self, username,passwd):
        # Log in to the Thomas back office
        try:
            self.download_file(username,passwd)
        except Exception as e:
            pass



def re_search(regex, subject):
    subject = str(subject)
    obj = re.compile(regex)
    match = obj.search(subject)
    if match:
        result = match.group(1)
    else:
        result = ''
    return result


def main():
    username = base64.b64decode(settings.THOMAS_USERNAME)
    passwd = base64.b64decode(settings.THOMAS_PASSWORD)
    TbTomas().run(username,passwd)


if __name__ == "__main__":
    main()
