Disclaimer: This article is for study and research only. Using it for illegal purposes is prohibited, and anyone who does so bears the consequences. If anything here infringes on your rights, please let me know and it will be deleted. Thank you!
Project scenario:
This time I bring you a Selenium automated login to Zhihu through the QQ authorized-login entrance. I previously tried to simulate the login through the login API, but it didn't work, so it's better to log in through the browser first.
Solution:
1. OK, let's try it. Click the QQ authorization icon, then choose account-and-password login. After you enter the QQ account and password, a slider CAPTCHA pops up that asks you to fill a gap.
2. At first glance, a gap-filling CAPTCHA looks like a headache. Fortunately, this site does not check for human-like behavioral characteristics, so we only need to calculate the distance the slider has to travel. The verification approach is as follows:
- First obtain two pictures: the background picture with a gap, and the picture of the missing block.
- Then binarize them.
- Then use cv2.matchTemplate to find where the block fits in the background picture, which gives the offset x, and measure the actual sliding distance, distance, in the browser.
- This x is not yet the distance Selenium needs to slide. Measure several (x, distance) pairs and use a curve fitting tool to derive a formula that converts x into the actual sliding distance.
- The distance given by that formula is the distance Selenium actually slides.
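The binarize-then-match idea in the list above can be sketched on synthetic data. This is only an illustration under stated assumptions: the arrays are made up, the helper names `binarize` and `find_gap_x` are hypothetical, and a brute-force sum of squared differences in NumPy stands in for cv2.matchTemplate.

```python
import numpy as np

def binarize(gray, threshold=127):
    """Map a grayscale array to pure black (0) and white (255)."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

def find_gap_x(background, block):
    """Return the column where `block` best matches `background`.

    Brute-force sum of squared differences; a simple stand-in for
    cv2.matchTemplate, not the method the article's code uses.
    """
    bg = background.astype(np.int32)
    blk = block.astype(np.int32)
    bh, bw = blk.shape
    best_score, best_col = None, 0
    for row in range(bg.shape[0] - bh + 1):
        for col in range(bg.shape[1] - bw + 1):
            window = bg[row:row + bh, col:col + bw]
            score = int(((window - blk) ** 2).sum())
            if best_score is None or score < best_score:
                best_score, best_col = score, col
    return best_col

# Synthetic background: light pixels with a dark 10x10 gap starting at column 37
background = np.full((40, 80), 200, dtype=np.uint8)
background[12:22, 37:47] = 40
# Synthetic block, already dark like the gap (the article inverts the real block)
block = np.full((10, 10), 40, dtype=np.uint8)

gap_x = find_gap_x(binarize(background), binarize(block))
print("gap starts at column", gap_x)
```

On real CAPTCHA images the match is never this clean, which is one reason the raw offset still has to be converted into a sliding distance.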
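3. Let's talk about how to get the two pictures first. Their addresses are easy to obtain: they are the src attributes of the slideBg (background) and slideBlock (missing block) elements, as the complete code at the end shows.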
4. Calculate the missing block position.
    def get_diff_location():
        # Read both images as grayscale
        block = cv2.imread("block.jpg", 0)  # Missing block picture
        index = cv2.imread("index.jpg", 0)  # Background picture
        # File names for the processed images
        block1 = "block1.jpg"
        index1 = "index1.jpg"
        # Save the processed images
        cv2.imwrite(block1, block)
        cv2.imwrite(index1, index)
        block = cv2.imread(block1)
        block = cv2.cvtColor(block, cv2.COLOR_RGB2GRAY)
        # Invert the block so that it resembles the gap in the background
        block = abs(255 - block)
        cv2.imwrite(block1, block)
        block = cv2.imread(block1)
        template = cv2.imread(index1)
        # Get the offset
        result = cv2.matchTemplate(block, template, cv2.TM_CCOEFF_NORMED)
        # result is a matrix holding the matching score of each position.
        # unravel_index returns (row, column), so y is the horizontal offset.
        x, y = np.unravel_index(result.argmax(), result.shape)
        # print("Offset in x direction", int(y * 0.4 + 18), 'x:', x, 'y:', y)
        return y
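One detail of the function above is easy to misread: np.unravel_index returns indices in (row, column) order, so the second value is the horizontal position. A minimal sketch with a made-up score matrix (the values are hypothetical, not real matchTemplate output):

```python
import numpy as np

# Hypothetical 3x5 score matrix; the best score sits at row 1, column 3
result = np.zeros((3, 5))
result[1, 3] = 0.9

# argmax() gives a flat index; unravel_index converts it back to (row, column)
row, col = np.unravel_index(result.argmax(), result.shape)
print("row:", row, "column:", col)
```

This is why the function returns its second variable, named y there, as the offset along the x axis.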
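5. Get the actual sliding distance from the browser by subtracting the two measured values, that is, the slider's position after it fills the gap minus its starting position.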
6. Then repeat steps 4 and 5 to collect several more data pairs, and feed them to the curve fitting website to get the conversion formula. Finally, test it with Selenium; it usually succeeds within 1 to 3 attempts!
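If you prefer to fit the formula locally instead of on a website, np.polyfit can do the same job. The sketch below assumes a cubic, since the formula in the complete code is a cubic. The (x, distance) pairs are synthetic, generated from made-up coefficients purely to show the call; substitute your own measurements.

```python
import numpy as np

# Made-up cubic used only to generate synthetic sample pairs for this demo
demo_coeffs = [0.00002, -0.01, 2.0, -50.0]
xs = np.array([100.0, 150.0, 200.0, 250.0, 300.0, 350.0])
distances = np.polyval(demo_coeffs, xs)  # stand-in for measured distances

# Fit a degree-3 polynomial to the (x, distance) pairs
coeffs = np.polyfit(xs, distances, deg=3)
to_distance = np.poly1d(coeffs)

# Convert a new offset into a sliding distance
print(int(to_distance(220)))
```

A fitted polynomial is only trustworthy inside the range of x values you measured, which may be why the complete code rejects offsets and distances that fall outside the expected range.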
7. Finally, here is the complete code!
    import cv2
    import numpy as np
    import time
    import requests
    from selenium import webdriver
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities


    def get_diff_location():
        # Read both images as grayscale
        block = cv2.imread("block.jpg", 0)  # Missing block picture
        index = cv2.imread("index.jpg", 0)  # Background picture
        # File names for the processed images
        block1 = "block1.jpg"
        index1 = "index1.jpg"
        # Save the processed images
        cv2.imwrite(block1, block)
        cv2.imwrite(index1, index)
        block = cv2.imread(block1)
        block = cv2.cvtColor(block, cv2.COLOR_RGB2GRAY)
        # Invert the block so that it resembles the gap in the background
        block = abs(255 - block)
        cv2.imwrite(block1, block)
        block = cv2.imread(block1)
        template = cv2.imread(index1)
        # Get the offset
        result = cv2.matchTemplate(block, template, cv2.TM_CCOEFF_NORMED)
        # result is a matrix holding the matching score of each position.
        # unravel_index returns (row, column), so y is the horizontal offset.
        x, y = np.unravel_index(result.argmax(), result.shape)
        # print("Offset in x direction", int(y * 0.4 + 18), 'x:', x, 'y:', y)
        return y


    def run():
        url = 'https://www.zhihu.com/signin?next=%2F'
        option = webdriver.ChromeOptions()
        option.add_experimental_option('excludeSwitches', ['enable-automation'])  # webdriver anti-detection
        option.add_argument("--no-sandbox")
        option.add_argument("--disable-dev-shm-usage")
        # Modify page load strategy
        desired_capabilities = DesiredCapabilities.CHROME
        # Commenting out these two lines will cause a delay in the final output,
        # that is, wait for the page to load before outputting
        desired_capabilities["pageLoadStrategy"] = "none"
        driver = webdriver.Chrome(options=option)
        driver.get(url)
        time.sleep(2)
        # Click the QQ authorized-login button
        driver.find_element_by_xpath('//*[@id="root"]/div/main/div/div/div/div[3]/span[2]/button[2]').click()
        time.sleep(2)
        driver.switch_to.window(driver.window_handles[-1])  # Switch to the newly opened window
        time.sleep(1)
        driver.switch_to.frame("ptlogin_iframe")
        driver.find_element_by_id('switcher_plogin').click()  # Switch to account-and-password login
        time.sleep(2)
        # Enter account and password
        driver.find_element_by_id('u').send_keys('123123123')
        driver.find_element_by_id('p').send_keys('123123123')
        time.sleep(1)
        # Click to log in
        driver.find_element_by_id('login_button').click()
        time.sleep(3)
        driver.switch_to.frame('tcaptcha_iframe')
        while True:
            # Save the background image with the gap
            with open('index.jpg', 'wb') as f:
                url = driver.find_element_by_id('slideBg').get_attribute('src')
                f.write(requests.get(url).content)
            # Save the missing block image
            with open('block.jpg', 'wb') as f:
                url = driver.find_element_by_id('slideBlock').get_attribute('src')
                f.write(requests.get(url).content)
            # Get the slider
            button = driver.find_element_by_id('tcaptcha_drag_thumb')
            # Press and hold the slider
            ActionChains(driver).click_and_hold(button).perform()
            x = get_diff_location()
            print('distance before fitting', x)
            if x > 500:
                # Refresh the CAPTCHA
                print('The distance before fitting is wrong')
                ActionChains(driver).release().perform()  # Release the mouse
                driver.find_element_by_id('e_reload').click()
                time.sleep(2)
                continue
            distance = int(-0.002886710239855681*x*x*x+4.044880174577657*x*x-1888.1544118978823*x+293800.78433441074)
            if (distance > 200) and (distance < 300):
                distance -= 100
            elif distance > 300:
                print('distance error')
                # Refresh the CAPTCHA
                ActionChains(driver).release().perform()  # Release the mouse
                driver.find_element_by_id('e_reload').click()
                time.sleep(2)
                continue
            print('Approximate distance to slide', distance)
            ActionChains(driver).move_by_offset(xoffset=distance, yoffset=0).perform()
            time.sleep(1)
            ActionChains(driver).release().perform()  # Release the mouse
            time.sleep(2)
            # Check whether the slide succeeded: the slider is gone once verification passes
            try:
                driver.find_element_by_id('tcaptcha_drag_thumb')
                print('Swipe failed, will try again!')
                # Refresh the CAPTCHA
                driver.find_element_by_id('e_reload').click()
                time.sleep(2)
            except Exception:
                break
        print('Verification succeeded!')
        time.sleep(11111)  # Keep the browser open


    if __name__ == '__main__':
        run()
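The distance conversion inside the loop can also be pulled out into a small pure function, which makes it easy to check without opening a browser. This is a sketch only: the name `to_slide_distance` is hypothetical, the coefficients are copied from the code above, and they were fitted to the author's own measurements, so they will most likely need refitting on another machine or screen.

```python
# Coefficients copied from the fitted cubic in the complete code
COEFFS = (-0.002886710239855681, 4.044880174577657,
          -1888.1544118978823, 293800.78433441074)

def to_slide_distance(x):
    """Convert the matched offset x into a drag distance.

    Returns None when the CAPTCHA should be refreshed instead,
    mirroring the range checks in the loop above.
    """
    if x > 500:
        return None
    a, b, c, d = COEFFS
    distance = int(a * x * x * x + b * x * x + c * x + d)
    if 200 < distance < 300:
        distance -= 100
    elif distance > 300:
        return None
    return distance

print(to_slide_distance(460))
print(to_slide_distance(501))
```

Returning None for the refresh cases keeps the two "release, reload, continue" branches of the loop in one place on the caller's side.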