Python Github automatic login, Github open source project analysis

1. Introduction

Github source link: https://github.com/Python3WebSpider/GithubLogin
Cui Qingcai: https://cuiqingcai.com/8229.html

For automatic login, Selenium is usually the most convenient tool, and I have used it in practice several times before. This time, let's try constructing the form ourselves and logging in with a plain POST request.

2. Analysis

1) Login process

Before that, we first need to be clear about what Cookies and Sessions are. Both store information about an individual user. The difference is that Cookies are stored on the client side, that is, on the user's own computer, while Sessions are stored on the server side.

If both store user information, why do we need two of them?

First, keeping Cookies on the client takes some of the load off the server. Second, the Cookies carry the SessionID. When the POST request is sent, the Cookies go to the server along with it; the server looks up the Session that matches the SessionID, and once it finds one the login succeeds.

In code, the Cookies can be handled with requests.Session(), which maintains a session and stores and sends Cookies automatically.
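
The Cookie and Session mechanism described above can be sketched without any network access. The toy example below uses only the standard library. ToyServer, ToyClient, and the session_id cookie name are made up for illustration; this is not how GitHub or requests are implemented, only the general idea of a server-side store looked up through an ID carried in a cookie.

```python
import uuid
from http.cookies import SimpleCookie


class ToyServer:
    """Hypothetical server: keeps Sessions in memory, keyed by session ID."""

    def __init__(self):
        self.sessions = {}

    def login(self, username):
        # Create a server-side Session and hand its ID back as a cookie.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {'user': username}
        cookie = SimpleCookie()
        cookie['session_id'] = session_id
        return cookie.output(header='').strip()  # e.g. "session_id=<hex>"

    def whoami(self, cookie_header):
        # Find the Session that matches the ID stored in the cookie.
        cookie = SimpleCookie(cookie_header)
        if 'session_id' not in cookie:
            return None
        session = self.sessions.get(cookie['session_id'].value)
        return session['user'] if session else None


class ToyClient:
    """Hypothetical client: remembers the cookie and sends it on every request."""

    def __init__(self, server):
        self.server = server
        self.cookie_header = ''

    def login(self, username):
        self.cookie_header = self.server.login(username)

    def whoami(self):
        return self.server.whoami(self.cookie_header)


server = ToyServer()
client = ToyClient(server)
print(client.whoami())   # no cookie yet, so the server does not know the client
client.login('alice')
print(client.whoami())   # the cookie now carries the session ID
```

A second ToyClient created against the same server is not logged in, because it has no cookie of its own. This is the bookkeeping that requests.Session() takes care of for real HTTP requests.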

2) Constructing the form

To figure out what the form contains and how its fields relate to earlier requests, first clear the GitHub-related Cookies in the browser, and then visit the login page: https://github.com/login. After logging in, inspect the session request (the POST to https://github.com/session) in the browser's developer tools.


After logging in several times, you can see that the login and password fields hold the username and password you entered, the authenticity_token field is a random-looking string that differs every time, timestamp is a timestamp, and the other fields stay the same. So the only field that needs work is authenticity_token.

Since this field is submitted with the form just like the username and password, it is probably already present on the login page (https://github.com/login).

If you enabled log persistence in the developer tools beforehand, you can also see the request for the login page and its response.

However, I couldn't read the source of the login page that way, so I used Fiddler, which is handy when analyzing traffic across multiple pages.
(About the installation and use of Fiddler: https://www.bookstack.cn/read/piaosanlang-spiders/c10f6ef032db5d49.md)


The request and response headers contain nothing about the authenticity_token field, so take a look at the page source to see whether it is there.

Found it! The page contains an input node whose name attribute is authenticity_token, and its value attribute holds the value for this field. Comparing it with the form in the session request shows that the two are the same, so when constructing the form you only need to fetch and parse the /login page and read the value of the authenticity_token input.
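
As a minimal sketch of that parsing step, the snippet below pulls the value out of a hidden input using only the standard library's html.parser (the code later in this post uses pyquery instead). SAMPLE_LOGIN_HTML, its token value, and the helper names are invented for illustration; the real login page is far larger and its markup may differ.

```python
from html.parser import HTMLParser

# Invented stand-in for the login page; the token value is a placeholder.
SAMPLE_LOGIN_HTML = '''
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="EXAMPLE-TOKEN-123" />
  <input type="text" name="login" />
  <input type="password" name="password" />
</form>
'''


class TokenParser(HTMLParser):
    """Remembers the value of the first input named authenticity_token."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == 'input'
                and attributes.get('name') == 'authenticity_token'
                and self.token is None):
            self.token = attributes.get('value')


def extract_token(html):
    """Return the token found in html, or None if there is no such input."""
    parser = TokenParser()
    parser.feed(html)
    return parser.token


print(extract_token(SAMPLE_LOGIN_HTML))
```

Returning None when the input is missing makes it easy for the caller to stop before submitting an incomplete form.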

3. Code

1) Dependencies

from pyquery import PyQuery as pq
import requests
import urllib3

2) Constructor

Define a Github class and initialize some variables

class Github:
    def __init__(self):
        """
        initialization

        """
        urllib3.disable_warnings()
        self.login_url = 'https://github.com/login'
        self.post_url = 'https://github.com/session'
        # Homepage
        self.self_url = 'https://github.com/Pineapple666'
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0'
        }
        self.session = requests.Session()

3) Getting the field value

    def token(self):
        """
        Fetch the login page and extract the authenticity_token field

        :return: the token string, or None if the request fails
        """
        response = self.session.get(url=self.login_url, headers=self.headers, verify=False)
        if response.status_code == 200:
            html = response.text
            doc = pq(html)
            return doc('input[name="authenticity_token"]').attr('value')
        else:
            print(f'response status code = {response.status_code}')

The return value of this method is used when constructing the form below.

4) Constructing the form and logging in

    def login(self, username, password):
        """
        submit form, log in

        :param username: username
        :param password: password
        :return: None
        """
        post_data = {
            'webauthn-iuvpaa-support': 'supported',
            'authenticity_token': self.token(),
            'webauthn-support': 'supported',
            'required_field_b538': None,
            'password': password,
            'commit': 'Sign in',
            'login': username,
            'return_to': None,
        }
        response = self.session.post(url=self.post_url, data=post_data, headers=self.headers)
        # A 200 status only shows that the request completed; test() confirms the login
        if response.status_code == 200:
            print('login successful!')
        else:
            print(f'response status code = {response.status_code}')
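
One detail of the form above: as far as I know, requests leaves out any field whose value is None when it encodes data, so required_field_b538 and return_to are not actually sent. The sketch below imitates that behaviour with only the standard library, to show roughly what the encoded body looks like. encode_form and all of the field values are made up for illustration, and this is not requests' own code.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(fields):
    """Drop None-valued fields, then URL-encode the rest as a form body."""
    present = {key: value for key, value in fields.items() if value is not None}
    return urlencode(present)


form = {
    'authenticity_token': 'EXAMPLE+TOKEN/123==',  # placeholder, not a real token
    'login': 'demo-user',                         # placeholder username
    'password': 'p@ss word',                      # placeholder password
    'commit': 'Sign in',
    'return_to': None,                            # dropped before encoding
}

body = encode_form(form)
print(body)

# Decoding the body gives back the original values, special characters included.
decoded = parse_qs(body)
print(decoded['authenticity_token'][0])
```

Characters such as +, / and = in the token are percent-encoded in the body, which is why the token should be passed as a plain string and left to the library to encode.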

5) Testing the login

    def test(self):
        """
        Test login success, output user ID

        :return: None
        """
        response = self.session.get(url=self.self_url, headers=self.headers, verify=False)
        if response.status_code == 200:
            html = response.text
            doc = pq(html)
            user_id = doc('span[itemprop="additionalName"]').text()
            print(f'your id is {user_id}')
        else:
            print(f'response status code = {response.status_code}')

6) Full code

# -*- coding: utf-8 -*-
"""
@author     :Pineapple

@Blog       :https://blog.csdn.net/pineapple_C

@contact    :cppjavapython@foxmail.com

@time       :2020/8/19 8:50

@file       :github.py

@desc       :
"""
from pyquery import PyQuery as pq
import requests
import urllib3


class Github:
    def __init__(self):
        """
        initialization

        """
        urllib3.disable_warnings()
        self.login_url = 'https://github.com/login'
        self.post_url = 'https://github.com/session'
        # Homepage
        self.self_url = 'https://github.com/Pineapple666'
        self.headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0'
        }
        self.session = requests.Session()

    def login(self, username, password):
        """
        submit form, log in

        :param username: username
        :param password: password
        :return: None
        """
        post_data = {
            'webauthn-iuvpaa-support': 'supported',
            'authenticity_token': self.token(),
            'webauthn-support': 'supported',
            'required_field_b538': None,
            'password': password,
            'commit': 'Sign in',
            'login': username,
            'return_to': None,
        }
        response = self.session.post(url=self.post_url, data=post_data, headers=self.headers)
        # A 200 status only shows that the request completed; test() confirms the login
        if response.status_code == 200:
            print('login successful!')
        else:
            print(f'response status code = {response.status_code}')

    def token(self):
        """
        Fetch the login page and extract the authenticity_token field

        :return: the token string, or None if the request fails
        """
        response = self.session.get(url=self.login_url, headers=self.headers, verify=False)
        if response.status_code == 200:
            html = response.text
            doc = pq(html)
            return doc('input[name="authenticity_token"]').attr('value')
        else:
            print(f'response status code = {response.status_code}')

    def test(self):
        """
        Test login success, output user ID

        :return: None
        """
        response = self.session.get(url=self.self_url, headers=self.headers, verify=False)
        if response.status_code == 200:
            html = response.text
            doc = pq(html)
            user_id = doc('span[itemprop="additionalName"]').text()
            print(f'your id is {user_id}')
        else:
            print(f'response status code = {response.status_code}')


if __name__ == '__main__':
    username = input('username:')
    password = input('password:')
    github = Github()
    github.login(username, password)
    github.test()

7) Output

If there are any mistakes, please send a private message to correct them!
Technology never ends, thank you for your support!

Tags: Python

Posted by computerzworld on Sat, 21 May 2022 20:30:28 +0300