I'm stupid, really, I just know

background

A meteorological data display project at my company involves parsing a lot of data in its original format, such as gridded meteorological data. Internally, such data is usually a two-dimensional array stored in binary form.

Gridded data can essentially be thought of as an image in which each pixel holds a data value.

task

I took over the parsing work: converting the binary data into a more general text format that is easier to inspect and display.
There was one extra requirement. The original data is dense: a 1km × 1km grid, 435km in the vertical direction and 355km in the horizontal direction, for a total of 435x355=154425 points. Because the front end is not optimized well enough to handle that many points, the density has to be reduced to 5km × 5km.
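The sizes involved are easy to check with a little arithmetic. The sketch below assumes that each value occupies 2 bytes (suggested by the 2-byte buffer in the parsing code) and that downsampling keeps every 5th row and column. The class and method names are made up for illustration.

```java
// Illustrative arithmetic only; GridMath and its members are hypothetical names.
class GridMath {
    static final int ROWS = 435;          // vertical extent, 1 km per row
    static final int COLS = 355;          // horizontal extent, 1 km per column
    static final int STEP = 5;            // assumption: keep every 5th row and column
    static final int BYTES_PER_VALUE = 2; // assumption: one value is stored in 2 bytes

    // Number of samples kept when taking every `step`-th element out of `size`.
    static int downsampled(int size, int step) {
        return (size + step - 1) / step;
    }

    // Size in bytes of one full-resolution row.
    static int bytesPerRow() {
        return COLS * BYTES_PER_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("original points:  " + ROWS * COLS);
        System.out.println("downsampled grid: " + downsampled(ROWS, STEP) + " x " + downsampled(COLS, STEP));
        System.out.println("bytes per row:    " + bytesPerRow());
        System.out.println("bytes in 5 rows:  " + STEP * bytesPerRow());
    }
}
```

Under these assumptions the reduced grid is 87 x 71 points, one row takes 710 bytes, and 5 rows take 3550 bytes.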

Parsing code

Binary data is simply the values encoded one after another, so the code is not difficult. It only took a little while to type out:

// InputStream in: the source stream, opened elsewhere
float[][] data = new float[HEIGHT][WIDTH];             // parsed values, one float per grid point
byte[] buf = new byte[2];                              // scratch buffer for one 2-byte value
BufferedInputStream bin = new BufferedInputStream(in); // wrap the source in a buffered stream
for (int y = 0; y < HEIGHT; y++) {                     // read row by row, starting at the top left
    for (int x = 0; x < WIDTH; x++) {
        data[y][x] = read(bin, buf);                   // read one value
    }
    skip(bin, WIDTH * 5 * 2 * 4);                      // skip the next 4 rows
}
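The read helper is not shown in this post. A minimal sketch of what such a helper might look like follows, assuming each value is a big-endian signed 16-bit integer stored without a scale factor; the real byte order and scaling depend on the data format and are guesses here. It loops because a single InputStream.read call may return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the read(bin, buf) helper used in the parsing loop.
class ValueReader {
    // Decodes a 2-byte buffer as a big-endian signed 16-bit value (assumed format).
    static float decode(byte[] buf) {
        return (short) (((buf[0] & 0xFF) << 8) | (buf[1] & 0xFF));
    }

    // Fills buf completely, looping over short reads, then decodes it.
    static float read(InputStream in, byte[] buf) throws IOException {
        int filled = 0;
        while (filled < buf.length) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n < 0) {
                throw new EOFException("stream ended after " + filled + " of " + buf.length + " bytes");
            }
            filled += n;
        }
        return decode(buf);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {0x01, 0x02, (byte) 0xFF, (byte) 0xFE};
        InputStream in = new ByteArrayInputStream(raw);
        byte[] buf = new byte[2];
        System.out.println(read(in, buf)); // bytes 0x01 0x02 decode to 258
        System.out.println(read(in, buf)); // bytes 0xFF 0xFE decode to -2 (signed)
    }
}
```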

problem

After parsing, I loaded the results into the display interface to take a look. The parsed results came out as stripes.

And the maddening part is this:
if I do not reduce the density (parsing the full width and height value by value, without skipping), the result is normal again.

process

What followed was a painful process of trial and error. I tried printing the current position of the stream, because the pattern looked like a misalignment. However, the position I recorded looked normal.

Turnaround

Along the way I kept telling myself that something in my code had to be wrong.
Then, by chance, I passed a buffer size of 3550 (exactly the size of 5 rows) when creating the buffered stream:

BufferedInputStream bin = new BufferedInputStream(in, 3550);

Ah!

That's exactly what I wanted.
It reminds me of that song, "My Skateboard Shoes".

reason

In fact, it is very simple. The problem lies in this line:

skip(bin,WIDTH*5*2*4);//Skip 4 lines down

And my skip helper was written like this (wrapping the checked exception so that callers do not have to declare it):

try {
    inputStream.skip(n);
} catch (Exception ex) {
    throw new RuntimeException("reading data error:", ex);
}

In other words, I assumed that skip guarantees the requested number of bytes is actually skipped. Then I looked up the skip method of BufferedInputStream:

    public synchronized long skip(long n) throws IOException {
        this.getBufIfOpen();
        if (n <= 0L) {
            return 0L;
        } else {
            long avail = (long)(this.count - this.pos);
            if (avail <= 0L) {
                if (this.markpos < 0) {
                    return this.getInIfOpen().skip(n);
                }

                this.fill();
                avail = (long)(this.count - this.pos);
                if (avail <= 0L) {
                    return 0L;
                }
            }

            long skipped = avail < n ? avail : n;
            this.pos = (int)((long)this.pos + skipped);
            return skipped;
        }
    }

As you can see, BufferedInputStream does not guarantee that the requested number of bytes is skipped: if the request exceeds the bytes remaining in the current buffer, it only skips to the end of the buffered data and returns the number of bytes it actually skipped.

So I thought the stream had jumped to position X, when in fact it had stopped short of it. That is why the stripes appeared.
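The short skip is easy to reproduce in isolation. The sketch below is a demonstration under made-up parameters (100 bytes of in-memory data, an 8-byte buffer, a 20-byte skip request), not the project's code: it reads one byte so the buffer gets filled, then asks skip for more than the buffer still holds.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Demonstration only; ShortSkipDemo and its parameters are invented for this example.
class ShortSkipDemo {
    // Reads one byte (which fills the buffer), then requests a skip of `request` bytes
    // and returns how many bytes skip() reports it actually skipped.
    static long skipAfterOneRead(int dataLength, int bufferSize, long request) {
        try (BufferedInputStream bin =
                 new BufferedInputStream(new ByteArrayInputStream(new byte[dataLength]), bufferSize)) {
            bin.read();               // triggers a buffer fill of up to bufferSize bytes
            return bin.skip(request); // can stop at the end of the buffered bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long skipped = skipAfterOneRead(100, 8, 20);
        System.out.println("requested 20, skipped " + skipped);
    }
}
```

With the skip implementation quoted above, 7 buffered bytes remain after the first read, so the call comes back with 7 rather than 20 even though the underlying data has plenty of bytes left.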

solution

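Once the cause is known, there are several ways to fix it:

  • Adjust the parameter: pass 3550 as the buffer size, as above, so that each skip lands exactly on the intended position;
  • Use a do-while loop that keeps skipping while fewer bytes than requested have been skipped (and stops if skip makes no progress, for example at the end of the stream, so it cannot spin forever):
      do {
            long skipped = inputStream.skip(n);
            if (skipped <= 0) break;
            n -= skipped;
      } while (n > 0);
  • Do not use BufferedInputStream:
    Reading directly from the underlying FileInputStream does not have this problem, because for a regular file its skip moves the file position itself. The general InputStream.skip contract still allows fewer bytes to be skipped than requested, so checking the return value remains a good habit.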
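The loop from the second option can be packaged as a small helper. The sketch below is one possible shape, with a hypothetical name (skipFully): it keeps calling skip, falls back to reading a single byte when skip reports no progress, and throws EOFException if the stream ends first. On Java 12 and later, InputStream.skipNBytes(long) offers the same guarantee out of the box.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper; the name skipFully does not come from the post.
class SkipUtil {
    // Skips exactly n bytes, or throws EOFException if the stream ends first.
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped > 0) {
                n -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms the end of the stream
                throw new EOFException(n + " bytes left to skip at end of stream");
            } else {
                n--; // read() consumed one byte, so count it as skipped
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 8);
        in.read();                     // consume byte 0
        skipFully(in, 20);             // skip bytes 1..20, although the buffer holds only 7 of them
        System.out.println(in.read()); // next byte is 21
    }
}
```

Throwing at the end of the stream, instead of returning quietly, means a truncated file shows up as an error rather than as shifted data.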

In the end I went with the 3550 buffer size, and also added the do-while check to guard against possible problems in the future.

summary

As the title says, I'm stupid. Really. I only knew that InputStream.read may return fewer bytes than requested; I did not know that skip can come up short as well.
As the saying goes, when the foundation is weak, the earth shakes and the mountains sway. It is very important to keep learning and to strengthen the fundamentals!

Tags: Java

Posted by leandro on Mon, 16 May 2022 01:31:24 +0300