Python: OSS object storage in the Toutiao (headline) project

Qiniu cloud storage

Requirements

In the Toutiao project, data such as user avatars and article images must be saved in a file storage system.

Options

  • Build your own file system service
  • Select third-party object storage services

The Toutiao project uses the Qiniu Cloud object storage service: http://www.qiniu.com.

Usage

  1. Register an account
  2. Create a new storage space (bucket)
  3. Implement the code using the Qiniu SDK

Qiniu Python SDK documentation: https://developer.qiniu.com/kodo/sdk/1242/python

Install the SDK

pip install qiniu

Code

Reference upload example provided by Qiniu

from qiniu import Auth, put_file, etag
import qiniu.config

# Fill in your Access Key and Secret Key
access_key = 'Access_Key'
secret_key = 'Secret_Key'

# Build the authentication object
q = Auth(access_key, secret_key)

# Bucket (storage space) to upload to
bucket_name = 'Bucket_Name'

# File name to save under after uploading
key = 'my-python-logo.png'

# Generate the upload token; 3600 is its lifetime in seconds
token = q.upload_token(bucket_name, key, 3600)

# Local path of the file to upload
localfile = './sync/bbb.jpg'

ret, info = put_file(token, key, localfile)
print(info)
# On success, ret holds the stored key and the hash of the content
assert ret['key'] == key
assert ret['hash'] == etag(localfile)

Implementation in the Toutiao project

from flask import current_app
from qiniu import Auth, put_data


def upload_image(file_data):
    """
    Upload an image to Qiniu.
    :param file_data: file content as bytes
    :return: key (file name) of the stored object
    """
    # The Access Key and Secret Key are read from the Flask configuration
    access_key = current_app.config['QINIU_ACCESS_KEY']
    secret_key = current_app.config['QINIU_SECRET_KEY']

    # Build the authentication object
    q = Auth(access_key, secret_key)

    # Bucket (storage space) to upload to
    bucket_name = current_app.config['QINIU_BUCKET_NAME']

    # File name to save under after uploading to Qiniu, e.g.
    # key = 'my-python-qiniu.png'
    # With key = None, Qiniu picks the name itself (by default the content hash).
    key = None

    # Generate the upload token, valid for 1800 seconds
    token = q.upload_token(bucket_name, expires=1800)

    # put_file uploads from a local path; put_data uploads bytes already in memory
    # ret, info = put_file(token, key, localfile)
    ret, info = put_data(token, key, file_data)

    # ret is None when the upload fails; info describes the response
    if ret is None:
        raise RuntimeError('Qiniu upload failed: {}'.format(info))

    return ret['key']

API for uploading avatar and ID card images

In common/utils/parser.py

import imghdr

def image_file(value):
    """
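    Check whether the uploaded file is an image.
    :param value: uploaded file object (a path or a readable, seekable stream)
    :return: the same object if it is recognized as an image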
    """
    try:
        file_type = imghdr.what(value)
    except Exception:
        raise ValueError('Invalid image.')
    else:
        if not file_type:
            raise ValueError('Invalid image.')
        else:
            return value
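
For comparison, the kind of check that imghdr performs can be approximated by comparing the first bytes of the file against known signatures. imghdr was deprecated in Python 3.11 and removed in 3.13, so a hand-written check along these lines is one possible fallback. This is only a sketch and not part of the project: sniff_image_type, image_file_by_magic and the signature table are hypothetical names, and only three formats are covered.

```python
import io

# Leading bytes ("magic numbers") that identify a few common image formats.
_SIGNATURES = (
    (b'\x89PNG\r\n\x1a\n', 'png'),
    (b'\xff\xd8\xff', 'jpeg'),
    (b'GIF87a', 'gif'),
    (b'GIF89a', 'gif'),
)


def sniff_image_type(stream):
    """Return 'png', 'jpeg', 'gif' or None without consuming the stream."""
    position = stream.tell()
    header = stream.read(8)
    stream.seek(position)  # rewind so a later read() still sees the whole file
    for magic, name in _SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def image_file_by_magic(value):
    """Validator with the same contract as image_file: return value or raise."""
    if sniff_image_type(value) is None:
        raise ValueError('Invalid image.')
    return value


# Usage with an in-memory stand-in for an uploaded file
fake_upload = io.BytesIO(b'\x89PNG\r\n\x1a\n' + b'\x00' * 16)
print(sniff_image_type(fake_upload))  # png
```

Rewinding the stream matters here because the resource later calls read() on the same file object to get the bytes to upload.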

toutiao/resources/user/profile.py

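# Assuming Flask-RESTful, which provides Resource and RequestParser
from flask import current_app, g
from flask_restful import Resource
from flask_restful.reqparse import RequestParser

# login_required, parser (common/utils/parser.py), upload_image, db, User and
# UserProfile come from project modules whose import paths are not shown here.


class PhotoResource(Resource):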
    """
    User image (avatar, ID card)
    """
    method_decorators = [login_required]

    def patch(self):
        file_parser = RequestParser()
        file_parser.add_argument('photo', type=parser.image_file, required=False, location='files')
        file_parser.add_argument('id_card_front', type=parser.image_file, required=False, location='files')
        file_parser.add_argument('id_card_back', type=parser.image_file, required=False, location='files')
        file_parser.add_argument('id_card_handheld', type=parser.image_file, required=False, location='files')
        files = file_parser.parse_args()

        user_id = g.user_id
        new_user_values = {}
        new_profile_values = {}
        return_values = {'id': user_id}

        if files.photo:
            try:
                photo_url = upload_image(files.photo.read())
            except Exception as e:
                current_app.logger.error('upload failed {}'.format(e))
                return {'message': 'Uploading profile photo image failed.'}, 507
            new_user_values['profile_photo'] = photo_url
            return_values['photo'] = current_app.config['QINIU_DOMAIN'] + photo_url
            need_delete_profile = True

        if files.id_card_front:
            try:
                id_card_front_url = upload_image(files.id_card_front.read())
            except Exception as e:
                current_app.logger.error('upload failed {}'.format(e))
                return {'message': 'Uploading id_card_front image failed.'}, 507
            new_profile_values['id_card_front'] = id_card_front_url
            return_values['id_card_front'] = current_app.config['QINIU_DOMAIN'] + id_card_front_url

        if files.id_card_back:
            try:
                id_card_back_url = upload_image(files.id_card_back.read())
            except Exception as e:
                current_app.logger.error('upload failed {}'.format(e))
                return {'message': 'Uploading id_card_back image failed.'}, 507
            new_profile_values['id_card_back'] = id_card_back_url
            return_values['id_card_back'] = current_app.config['QINIU_DOMAIN'] + id_card_back_url

        if files.id_card_handheld:
            try:
                id_card_handheld_url = upload_image(files.id_card_handheld.read())
            except Exception as e:
                current_app.logger.error('upload failed {}'.format(e))
                return {'message': 'Uploading id_card_handheld image failed.'}, 507
            new_profile_values['id_card_handheld'] = id_card_handheld_url
            return_values['id_card_handheld'] = current_app.config['QINIU_DOMAIN'] + id_card_handheld_url

        if new_user_values:
            User.query.filter_by(id=user_id).update(new_user_values)
        if new_profile_values:
            UserProfile.query.filter_by(id=user_id).update(new_profile_values)

        db.session.commit()

        return return_values, 201
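
The three ID card branches above differ only in the field name, so they could be folded into a loop. The following is a minimal sketch of that idea, not the project's code: upload_all and UploadFailed are hypothetical names, and the uploader and domain are passed in as parameters so the example runs without Flask or Qiniu. In the project, upload would be upload_image and domain would be current_app.config['QINIU_DOMAIN']. The avatar is left out because it is stored under a different column name (profile_photo).

```python
ID_CARD_FIELDS = ('id_card_front', 'id_card_back', 'id_card_handheld')


class UploadFailed(Exception):
    """Raised with the name of the field whose upload failed."""

    def __init__(self, field, cause):
        super().__init__('Uploading {} image failed: {}'.format(field, cause))
        self.field = field


def upload_all(files, upload, domain, fields=ID_CARD_FIELDS):
    """Upload every supplied file; return (values_to_store, values_to_return).

    files  -- mapping of field name to bytes (missing or empty fields are skipped)
    upload -- callable taking bytes and returning the stored key
    domain -- URL prefix that turns a key into a public URL
    """
    stored, returned = {}, {}
    for field in fields:
        data = files.get(field)
        if not data:
            continue
        try:
            key = upload(data)
        except Exception as exc:
            raise UploadFailed(field, exc) from exc
        stored[field] = key              # what goes into the database
        returned[field] = domain + key   # what goes back to the client
    return stored, returned


# Usage with a fake uploader standing in for upload_image
def fake_upload(data):
    return 'key-{}'.format(len(data))


stored, returned = upload_all(
    {'id_card_front': b'abc', 'id_card_back': None},
    fake_upload,
    'http://cdn.example.com/',
)
print(stored)
print(returned)
```

The caller can catch UploadFailed once and build the error message from its field attribute instead of repeating the try/except per field.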

===================================

CDN

One advantage of using a third-party OSS service is that it comes with an integrated CDN service. Let's look at what a CDN is.

What is a CDN

Full name: Content Delivery Network, also called a content distribution network.

A CDN distributes the content of the origin site to the nodes closest to users, so that users fetch the content they need from nearby, which improves the response speed and success rate of their requests. It addresses the access latency caused by geographic distribution, bandwidth and server performance, and suits site acceleration, video on demand, live streaming and similar scenarios.

Basic ideas

The basic idea is to avoid, as far as possible, the bottlenecks and links on the Internet that can hurt the speed and stability of data transmission, so that content is delivered faster and more reliably. A CDN places node servers throughout the network to form a layer of intelligent virtual network on top of the existing Internet. Using real-time information such as network traffic, each node's connectivity and load, the distance to the user and the response time, the CDN system redirects the user's request to the service node nearest to the user.

Objective

Let users fetch the content they need from nearby, relieve Internet congestion, and improve the response speed and success rate of visits to the website.

Controlling latency is undoubtedly an important metric in modern information technology. The aim of a CDN is to minimize the impact of forwarding, transmission and link jitter on resources, and to keep the delivery of information continuous.

A CDN plays the role of escort and accelerator: it gets information to every user faster and more accurately, for a better user experience.

Basic principles

The simplest CDN consists of one DNS server and several cache servers:

  1. When the user clicks a content URL on a web page, the local DNS system resolves the domain name and, following the CNAME record, finally hands resolution over to the CDN's dedicated DNS server.
  2. The CDN's DNS server returns the IP address of the CDN's global load balancing device to the user.
  3. The user sends the content URL request to the CDN's global load balancing device.
  4. Based on the user's IP address and the requested content URL, the global load balancing device selects a regional load balancing device in the user's region and tells the user to send the request to that device.
  5. The regional load balancing device selects a suitable cache server to serve the user. The selection is based on: which server is closest to the user, judged by the user's IP address; which server holds the requested content, judged by the content name in the requested URL; and which server still has spare capacity, judged by the current load of each server. After weighing these conditions, the regional load balancing device returns the IP address of a cache server to the global load balancing device.
  6. The global load balancing device returns the IP address of that server to the user.
  7. The user sends the request to the cache server, which responds and transmits the requested content to the user's terminal. If the cache server does not hold the content but the regional load balancing device assigned it anyway, the server requests the content from its upper-level cache server, and so on up to the website's origin server, and then pulls the content into its local cache.
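
Steps 5 and 7 can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not how any particular CDN is implemented: CacheServer, select_server, fetch and ORIGIN are hypothetical names, distance is reduced to a single number, and the order in which the three criteria are applied (capacity first, then content, then distance) is an assumption chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CacheServer:
    name: str
    distance_ms: float   # stand-in for how close the server is to the user
    load: float          # 0.0 (idle) to 1.0 (saturated)
    parent: Optional['CacheServer'] = None   # upper-level cache; None means origin
    store: Dict[str, bytes] = field(default_factory=dict)


# Stand-in for the website's origin server
ORIGIN = {'/img/logo.png': b'logo-bytes'}


def select_server(servers: List[CacheServer], url: str,
                  max_load: float = 0.9) -> CacheServer:
    """Step 5: drop overloaded servers, then prefer one that already holds
    the content, then the closest one."""
    candidates = [s for s in servers if s.load < max_load]
    if not candidates:
        raise RuntimeError('no cache server has spare capacity')
    return min(candidates, key=lambda s: (url not in s.store, s.distance_ms))


def fetch(server: CacheServer, url: str) -> bytes:
    """Step 7: serve from the local cache, otherwise pull from the upper-level
    cache (or the origin) and keep a local copy."""
    if url in server.store:
        return server.store[url]
    if server.parent is not None:
        content = fetch(server.parent, url)
    else:
        content = ORIGIN[url]
    server.store[url] = content
    return content


regional = CacheServer('regional', distance_ms=40, load=0.20)
edge_a = CacheServer('edge-a', distance_ms=5, load=0.95, parent=regional)
edge_b = CacheServer('edge-b', distance_ms=12, load=0.30, parent=regional)
edge_c = CacheServer('edge-c', distance_ms=8, load=0.50, parent=regional)

chosen = select_server([edge_a, edge_b, edge_c], '/img/logo.png')
body = fetch(chosen, '/img/logo.png')
print(chosen.name, body)
```

Here edge-a is the closest server but is skipped because it is overloaded; the first request misses on the chosen edge server and on the regional cache, so it is filled from the origin and cached at both levels on the way back.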

Common questions

1. Does a CDN accelerate the server that hosts the website, or its domain name?

A CDN accelerates only specific domain names of a website. If the same website has several domain names, visitors get the acceleration only through the domain names that have been added to the CDN. Visiting a domain name that has not been added, or visiting the IP address directly, bypasses the CDN.

2. What are the advantages of a CDN compared with mirror sites?

A CDN is completely transparent to the website's visitors. They do not have to pick a mirror site by hand, which keeps the website friendly to use. The CDN also checks the availability of every node and removes failing nodes immediately, which gives a very high availability that mirror sites cannot match. Finally, deploying a CDN requires no changes to the origin site.

3. What are the advantages of a CDN compared with a dual-line data center?

An ordinary dual-line data center only solves the problem of slow access between Netcom and Telecom; interconnection with other ISPs (such as the education network, mobile networks and Tietong) remains unsolved. A CDN lets visitors fetch data from a nearby node, and its nodes are spread across ISPs, so the website is fast from any ISP. In addition, because traffic is spread across the nodes, a CDN is naturally more resistant to network attacks.

4. After a CDN is in use, does the original website need to be modified, and if so how?

Generally speaking, a website can use a CDN for acceleration without any modification. Only programs that need to determine the visitor's IP address require small changes, because requests now reach the origin from CDN nodes rather than directly from visitors.
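
As an illustration of that kind of change: many CDNs and reverse proxies pass the visitor's address along in the X-Forwarded-For request header. Whether a given CDN does so, and under which header name, is an assumption here and should be checked against the provider's documentation. client_ip is a hypothetical helper, not part of the project.

```python
def client_ip(headers, remote_addr):
    """Best-effort visitor IP for a request that may have passed through a CDN.

    headers     -- mapping of request header names to values
    remote_addr -- address of the direct peer (the CDN node, when behind a CDN)
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    forwarded = normalised.get('x-forwarded-for', '')
    # The header is a comma-separated chain; the left-most entry is the
    # address the first proxy saw, i.e. the original client.
    first = forwarded.split(',')[0].strip()
    return first or remote_addr


print(client_ip({'X-Forwarded-For': '203.0.113.7, 10.0.0.1'}, '10.0.0.1'))
print(client_ip({}, '198.51.100.2'))
```

Since any client can send this header, it is only trustworthy when the request is known to have arrived through the CDN.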

5. Why do I still see the old page after updating my website behind a CDN? How can this be solved?

Because every CDN node caches content, the old pages remain visible after the website's static pages and images are modified until the CDN cache is refreshed. To solve this, the CDN management panel provides a URL push service that tells the CDN nodes to refresh their caches. Enter the specific page or image address in the URL push field, and the cached copy is deleted on every node, taking effect immediately. If there are too many pages and images to push one by one, choose directory push instead: entering http://www.kkk.com/news refreshes all pages and images under the website's news directory.
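
The difference between URL push and directory push can be sketched with an in-memory cache. This is a toy model of a single node, not a real CDN API: NodeCache and its methods are hypothetical names.

```python
class NodeCache:
    """In-memory stand-in for the cache held by one CDN node."""

    def __init__(self):
        self._entries = {}

    def put(self, url, body):
        self._entries[url] = body

    def get(self, url):
        # None means a cache miss: the node would refetch from the origin
        return self._entries.get(url)

    def push_url(self, url):
        """URL push: drop one cached object. Returns the number removed."""
        if url in self._entries:
            del self._entries[url]
            return 1
        return 0

    def push_directory(self, prefix):
        """Directory push: drop every cached object under the directory."""
        if not prefix.endswith('/'):
            prefix += '/'
        stale = [url for url in self._entries if url.startswith(prefix)]
        for url in stale:
            del self._entries[url]
        return len(stale)


cache = NodeCache()
cache.put('http://www.kkk.com/news/a.html', b'old page')
cache.put('http://www.kkk.com/news/img/b.png', b'old image')
cache.put('http://www.kkk.com/newsletter.html', b'unrelated')

removed = cache.push_directory('http://www.kkk.com/news')
print(removed)  # 2
```

The trailing slash is added before matching so that a push of the news directory does not also remove newsletter.html, which merely shares the same prefix.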

6. Can the CDN be told not to cache pages and images that are highly time-sensitive?

Use dynamic pages built with ASP, PHP, JSP or other dynamic technologies. Such pages are not cached by the CDN, so they never need to be refreshed. Alternatively, give the website two domain names, one with the CDN enabled and one without, and serve the time-sensitive pages and images from the domain name without the CDN.
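
A different technique from the two described above, shown only for comparison, is to mark responses with the standard HTTP Cache-Control header. Whether a given CDN honors the header depends on the provider's configuration, so treat that as an assumption. cache_control_for, the extension set and the /live/ prefix are hypothetical choices for this sketch.

```python
import posixpath

# Hypothetical policy: file extensions treated as long-lived static assets
STATIC_EXTENSIONS = {'.png', '.jpg', '.jpeg', '.gif', '.css', '.js'}


def cache_control_for(path, time_sensitive_prefixes=('/live/',)):
    """Pick a Cache-Control header value for a response path."""
    # Time-sensitive content must never be stored by any cache
    if any(path.startswith(prefix) for prefix in time_sensitive_prefixes):
        return 'no-store'
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        # Static assets may be cached by anyone for one day
        return 'public, max-age=86400'
    # Everything else may be stored but must be revalidated before reuse
    return 'no-cache'


print(cache_control_for('/img/logo.png'))
print(cache_control_for('/live/score.png'))
print(cache_control_for('/news/index.php'))
```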

7. The website has added many new pages and images. Do they need a URL push?

No. Pages and images added later do not need a URL push, because they are not in the cache yet.

8. After the website starts using a CDN, visitors in some regions report that they cannot access it. What should be done?

Once a CDN is enabled, there are many possible reasons why visitors cannot reach the website: a problem with the CDN, a failure or shutdown of the origin site, a problem with the visitor's own network, or, as seen in actual troubleshooting, even malware on the visitor's own computer. When visitors report a fault, contact the CDN provider's 24-hour technical support to have it handled.

9. Under what circumstances is a CDN recommended?

Generally speaking, a CDN suits websites with a certain amount of traffic, including:

  • Websites that focus on information and content, such as news sites, government agency sites, industry platform sites and shopping malls
  • Websites that focus on dynamic content, such as forums, blogs, dating sites, SNS, online games, search / query and finance
  • Download-oriented sites, such as software developers, content service providers, online game operators and source code download sites
  • Websites with a large number of streaming media on-demand applications, such as telecom operators with video-on-demand platforms, content service providers, sports channels, broadband channels, online education and video blogs

Posted by catreya on Tue, 24 May 2022 01:06:02 +0300