
Bird Eye View Generation Using IPM (Inverse Perspective Mapping)

1. IPM

IPM is short for Inverse Perspective Mapping, and its job is to map a 2D image into 3D space. Since the result is expressed as a BEV (Bird Eye View), the final output is flat and can therefore still be expressed as a 2D image.

image

In the picture above, the left side is the 2D image and the right side is the BEV. The reason the RGB area of the BEV on the right widens from bottom to top is that it reflects the angle of view of the actual camera. When light enters the camera lens, you can think of the area projected onto the actual image as the angle-of-view region shown above.

If you look at the name IPM, Inverse Perspective Mapping, it can be broken into two parts.

First, Perspective Mapping means Perspective Projection. Perspective Projection is one of the methods of projecting a 3D scene onto a 2D image plane, and it creates a 2D image with perspective, like a commonly seen photograph. Even though objects are the same size, they appear larger when they are close and smaller when they are far away, and this form of distortion is called Perspective Distortion.
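As a rough worked example (my own, using the pinhole model and the focal length fx ≈ 2263 px from the calibration used later in this post), the image height $h$ of an object of physical height $H$ at distance $Z$ is approximately:

$$
h \approx \frac{f \cdot H}{Z}, \qquad h_{10\,m} \approx \frac{2263 \times 1.5}{10} \approx 340 \text{ px}, \qquad h_{40\,m} \approx \frac{2263 \times 1.5}{40} \approx 85 \text{ px}
$$

So the same 1.5 m tall object spans roughly four times as many pixels at 10 m as at 40 m; this distance-dependent scaling is exactly the Perspective Distortion that IPM removes for points on the ground.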

Inverse means reversing the Perspective Projection process and removing the Perspective Distortion explained above.

Therefore, IPM aims to express the entire 3D scene consistently, regardless of the distance from the camera, by eliminating Perspective Distortion, and the result is expressed in BEV format.

The reason for expressing the result as a BEV is that, when viewed as a BEV, objects can be drawn at the same scale regardless of their distance from the camera, and since the Depth of the actual 3D objects is unknown, we have to assume that all objects are attached to the ground without height. To convert the information of a 2D image into 3D, the height of an object could be recovered through Depth (refer to the transformation relationship between a point cloud and a depth map), but since this information is not available, a method that ignores the height of all objects, namely BEV, is chosen. When looking straight down at the ground from the sky, height is an unobservable direction of perspective, and BEV takes advantage of this.

In summary, IPM, Inverse Perspective Mapping, is the process of converting a 2D image into 3D space in order to remove Perspective Distortion, and then, because the Depth value is absent, expressing the result as a BEV.

Because we proceed with this assumption as a prerequisite, limitations such as the one shown in the figure below arise.

image

Due to the prerequisite assumption that the height of all objects is 0, that is, that all objects are attached to the ground without height, objects that actually lie on the ground (road, lanes, grass, etc.) appear normal with Perspective Distortion removed, while objects that have height, such as cars, look strange. In other words, they cannot be represented correctly because they violate the prerequisite.

Also, if the road itself is uphill or downhill, strange shapes occur because the assumption that the height of every point is 0 is violated.

Despite these disadvantages, it has the advantage that, in an ideal flat-ground environment, information on the ground can be recognized without Perspective Distortion, so it is sometimes used depending on the situation.

a. Background to IPM

The method we will look at now is how to convert a front-view image into a BEV image. IPM removes the camera's perspective effect and remaps the image onto a top-view-based 2D domain. In other words, BEV images have the property of preserving distances and parallel lines because the perspective effect has been corrected.

The image we want to create is made by converting the road in the perspective-view image into a flat two-dimensional plane. The road is therefore treated as a two-dimensional plane (X, Y, Z = 0) with no height information.

image

As shown in the picture above, the same object can be expressed at different positions depending on which coordinate system is used. In the picture above there are three different coordinate systems: Road, Vehicle, and Camera. In general, in order to apply IPM, you need to know the relationship between the Vehicle and Camera coordinate systems, so information that can transform between the two coordinate systems is obtained through calibration, and through these values the 3D scene in the vehicle (world) coordinate system is converted into a 2D image.

In the above content, what expresses the positional relationship between the Vehicle and the Camera is called the extrinsic parameter, and, with the Camera position set as the origin, what expresses the relationship between 3D information in the camera frame and the pixels of the image is called the intrinsic parameter. Please refer to camera calibration for details.

b. Perspective projection

Perspective projection is a mapping of 3D space onto a 2D plane. During the mapping process, when two parallel lines in 3D space are expressed on a 2D plane, a phenomenon occurs in which the parallel lines meet at a specific point.

image

If you look at the lines in the picture above, you can see that they are parallel lines in 3D space, but the lines meet in 2D space.

image

The picture on the left is a view of the road through a camera mounted on top of the vehicle, and the picture on the right is the surrounding environment as seen from a BEV. The observed scene varies depending on the viewpoint, that is, whether you view it as a perspective view or as a BEV.

The important difference between the two views is that when a projective transformation is applied, as on the left, parallel lines are no longer preserved. In the BEV, on the other hand, you can see that the parallel lines remain parallel; the previously mentioned Perspective Distortion has been removed.

The advantage of the IPM explained in this article is that parallel components remain parallel lines.

c. Camera Projective Geometry

image

image

The camera model converts a 3D scene into a 2D image through perspective projection, and when creating the 2D image, the desired 2D image is formed according to the values obtained through calibration: the intrinsic $K$ and the extrinsic $[R|t]$.

The extrinsic $[R|t]$ indicates the position and orientation between the world and the camera and serves to convert from world coordinates to camera coordinates. The intrinsic $K$ describes how a 3D scene expressed in camera coordinates is converted into a 2D image in image coordinates. $K$ contains the focal length and camera center components of the camera. (This corresponds to the simplest pinhole camera case; pixel skew, lens distortion, etc. are omitted.)
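Putting the two together, the projection used throughout this post can be written as follows, where $(X, Y, Z)$ is a 3D point, $[R|t]$ the extrinsic, $K$ the intrinsic with focal lengths $f_x, f_y$ and principal point $(u_0, v_0)$, and $s$ the projective scale:

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \, [R \,|\, t] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad K = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
$$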

image

If you follow the process through the picture above, the image acquired by the actual camera has the same form as image plane 1 (perspective view), and the image we want to create through IPM in the world coordinate system has the same form as image plane 2 (Bird Eye View). (For reference, in the picture above the axes of the world coordinate system are drawn in the reverse direction. The following explanations and code use the right-handed coordinate system.)

image

The method explained from here on uses a right-handed coordinate system as shown in the picture above, so the left side of the X-Z plane is the positive direction of the Y axis.

As mentioned earlier, to apply IPM we assume that the road is flat on the ground. Therefore, every point on the road must satisfy Height = 0 in world coordinates.

So if the road is not flat, the result appears distorted because the prerequisite is not met. For example, the road may look curved or the lanes may not look straight; this can happen because the road is not actually flat on the ground.

In addition, as we will see later, if the extrinsic is not correct, the assumed geometry is not satisfied and distortion may occur for the same reason.

Lastly, if the intrinsic is incorrect, distortion may occur because the correct RGB values of image plane 1 cannot be obtained and matched to image plane 2.

Therefore, if all three conditions of ① flat ground, ② extrinsic accuracy, and ③ intrinsic accuracy are satisfied, you can create a BEV image using IPM.

2. How to apply IPM

So far, we have looked at the background knowledge needed to apply IPM, and now we will look at the process of actually applying it.

In order to apply IPM, the following four steps are required; a minimal end-to-end sketch follows the list.

  • Read calibration information
  • Determine the relationship between BEV images and world coordinates
  • Obtain LUT (Look Up Table) between BEV image and Image coordinates
  • Create the BEV image by backward mapping (IPM)
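For orientation, the sketch below (my own) chains the four steps using the functions defined in the following sections (load_camera_params, generate_direct_backward_mapping) together with cv2.remap; the input path "frame.png" is a placeholder.

import cv2

extrinsic, intrinsic = load_camera_params()                 # step 1: calibration

world_x_min, world_x_max, world_x_interval = 7, 50, 0.05    # step 2: BEV design
world_y_min, world_y_max, world_y_interval = -10, 10, 0.025

map_x, map_y = generate_direct_backward_mapping(            # step 3: LUT
    world_x_min, world_x_max, world_x_interval,
    world_y_min, world_y_max, world_y_interval,
    extrinsic, intrinsic)

image = cv2.imread("frame.png")                             # placeholder front-view image
bev = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR,      # step 4: backward remap
                borderMode=cv2.BORDER_CONSTANT)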

a. Read calibration information

The data we are dealing with is the cityscapes data set, and you can use the information below. The calibration information for the data-acquisition vehicle provided by cityscapes is as follows.

{
    "baseline": 0.21409619719999115,
    "pitch": 0.03842560000000292,
    "roll": 0.0,
    "x": 1.7,
    "y": 0.026239999999999368,
    "yaw": -0.009726800000000934,
    "z": 1.212400000000026,

    "fx": 2263.54773399985,
    "fy": 2250.3728170599807,
    "u0": 1079.0175620000632,
    "v0": 515.0066006000195
}

From the above information, use roll, pitch, and yaw to extract the Vehicle → Camera Rotation, and use x, y, and z to extract the Vehicle → Camera Translation.

For how to extract the Vehicle → Camera Rotation Matrix from roll, pitch, and yaw, please refer to the Rotation Transform reference below.

In cityscapes, roll, pitch, yaw and x, y, z are all defined as a Camera → Vehicle relationship, so you just need to compute the extrinsic in the opposite direction. For the Rotation, the inverse can be obtained by taking the Transpose, as in the code below. For the Translation, the inverse is obtained by moving in the negative direction, so a negative sign is applied. For convenience of calculation, homogeneous coordinates are used.
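For reference, in homogeneous coordinates the inversion performed by the code below can be written as

$$
T_{camera \to vehicle} = \begin{bmatrix} R & t \\ 0 & 1 \end{bmatrix} \;\Longrightarrow\; T_{vehicle \to camera} = \begin{bmatrix} R^\top & -R^\top t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R^\top & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} I & -t \\ 0 & 1 \end{bmatrix},
$$

which is exactly the product R_veh2cam @ T_veh2cam formed in load_camera_params.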

import numpy as np


def rotation_from_euler(roll=1., pitch=1., yaw=1.):
    """
    Get rotation matrix
    Args:
        roll, pitch, yaw:       In radians

    Returns:
        R:          [4, 4]
    """
    si, sj, sk = np.sin(roll), np.sin(pitch), np.sin(yaw)
    ci, cj, ck = np.cos(roll), np.cos(pitch), np.cos(yaw)
    cc, cs = ci * ck, ci * sk
    sc, ss = si * ck, si * sk

    R = np.identity(4)
    R[0, 0] = cj * ck
    R[0, 1] = sj * sc - cs
    R[0, 2] = sj * cc + ss
    R[1, 0] = cj * sk
    R[1, 1] = sj * ss + cc
    R[1, 2] = sj * cs - sc
    R[2, 0] = -sj
    R[2, 1] = cj * si
    R[2, 2] = cj * ci
    return R


def translation_matrix(vector):
    """
    Translation matrix

    Args:
        vector list[float]:     (x, y, z)

    Returns:
        T:      [4, 4]
    """
    M = np.identity(4)
    M[:3, 3] = vector[:3]
    return M


def load_camera_params():
    """
    Get the intrinsic and extrinsic parameters
    Returns:
        Camera extrinsic and intrinsic matrices
    """
    p = {}
    p["roll"] =  0.0
    p["pitch"] =  0.03842560000000292
    p["yaw"] =  -0.009726800000000934

    p["x"] =  1.7
    p["y"] =  0.026239999999999368
    p["z"] =  1.212400000000026

    p["fx"] =  2263.54773399985
    p["fy"] =  2250.3728170599807
    p["u0"] =  1079.0175620000632
    p["v0"] =  515.006600600019

    fx, fy = p['fx'], p['fy']
    u0, v0 = p['u0'], p['v0']

    pitch, roll, yaw = p['pitch'], p['roll'], p['yaw']
    x, y, z = p['x'], p['y'], p['z']

    # Intrinsic
    K = np.array([[fx, 0, u0, 0],
                  [0, fy, v0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])

    # Extrinsic
    R_veh2cam = np.transpose(rotation_from_euler(roll, pitch, yaw))
    T_veh2cam = translation_matrix((-x, -y, -z))

    # Rotate to camera coordinates
    R = np.array([[0., -1., 0., 0.],
                  [0., 0., -1., 0.],
                  [1., 0., 0., 0.],
                  [0., 0., 0., 1.]])

    RT = R @ R_veh2cam @ T_veh2cam
    return RT, K

The reason for multiplying by the matrix R at the end of load_camera_params is that the axis directions change between the Vehicle and Camera coordinate systems. The correspondence is as follows:

Camera X ← World -Y

Camera Y ← World -Z

Camera Z ← World X

image
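As a quick sanity check (my own addition), applying this R to unit vectors along the world axes reproduces the correspondence above:

import numpy as np

# Axis-permutation matrix from load_camera_params, in homogeneous form
R = np.array([[0., -1., 0., 0.],
              [0., 0., -1., 0.],
              [1., 0., 0., 0.],
              [0., 0., 0., 1.]])

# Unit direction vectors of the world (vehicle) axes
world_x = np.array([1., 0., 0., 0.])
world_y = np.array([0., 1., 0., 0.])
world_z = np.array([0., 0., 1., 0.])

print("World +X ->", (R @ world_x)[:3])  # [ 0.  0.  1.] : Camera +Z
print("World +Y ->", (R @ world_y)[:3])  # [-1.  0.  0.] : Camera -X
print("World +Z ->", (R @ world_z)[:3])  # [ 0. -1.  0.] : Camera -Y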

b. Determine the relationship between BEV images and world coordinates

The advantage of the BEV image is that, even though perspective distortion has been removed, the world coordinate information is still contained in the image itself. That is, each pixel coordinate in the BEV image contains world coordinate location information. For example, a pixel (u_BEV, v_BEV) in the BEV image immediately tells you which (X, Y, Z = 0) point in world coordinates it corresponds to.

This relationship depends on how the user designs and creates the BEV image. In the example we will look at, the entire area of the image is designed by determining the direction and the minimum and maximum values of the world coordinate X and Y axes. Additionally, you can determine the overall size of the image by deciding how many meters one pixel step in the row direction actually corresponds to, and how many meters one pixel step in the column direction corresponds to. For example, moving by 1 pixel in the row direction of the BEV image can be made to change the world coordinate X by 0.05 (m), and moving by 1 pixel in the column direction can be made to change Y by 0.025 (m).

world_x_max = 50
world_x_min = 7
world_y_max = 10
world_y_min = -10

world_x_interval = 0.05
world_y_interval = 0.025

# Calculate the number of rows and columns in the output image
output_width = int(np.ceil((world_y_max - world_y_min) / world_y_interval))
output_height = int(np.ceil((world_x_max - world_x_min) / world_x_interval))

print("(width, height) :", "(", output_width, ",",  output_height, ")")
# (width, height) : ( 800 , 860 )

The above values mean that, based on the Vehicle coordinate system, the area from 7 to 50 m along the X axis and from -10 to 10 m along the Y axis will be turned into the BEV image. Since the row direction of the BEV image corresponds to the X axis of the world coordinate, moving by 1 pixel in the row direction changes X by 0.05 m, and by the same logic 1 pixel in the column direction changes Y by 0.025 m. When designed like this, the BEV image size becomes width 800 and height 860.

image

It will be easier to understand if you compare the previous explanation to the picture above.
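For reference, under this design a ground point (x, y, 0) maps to a BEV pixel with a simple linear rule. The small helper below is my own sketch (the name world_to_bev_pixel is not from the original post) and follows the same grid convention used later: rows sweep X from world_x_max downward and columns sweep Y from world_y_max downward.

def world_to_bev_pixel(x, y,
                       world_x_max=50, world_x_interval=0.05,
                       world_y_max=10, world_y_interval=0.025):
    # Row 0 corresponds to the far end (world_x_max); column 0 to y = world_y_max.
    row = int((world_x_max - x) / world_x_interval)
    col = int((world_y_max - y) / world_y_interval)
    return row, col

# A point 10 m ahead of the vehicle on its center line lands near the bottom center.
print(world_to_bev_pixel(10, 0))   # (800, 400)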

In this way, let's take a look at how arbitrary points in the world coordinate system correspond to the image coordinate system. The example points below are sample points selected to aid understanding.

points = np.array([
    [world_x_max, world_y_max, 0, 1],
    [world_x_max, world_y_min, 0, 1],
    [world_x_min, world_y_min, 0, 1],
    [world_x_min, world_y_max, 0, 1],
    [10, -3, 0, 1],
    [10, 3, 0, 1],
    [6, 0, 0, 1],
    [7, 0, 0, 1],
    [8, 0, 0, 1],
    [9, 0, 0, 1],
    [10, 0, 0, 1],
    [11, 0, 0, 1],
    [12, 0, 0, 1],
    [13, 0, 0, 1],
    [14, 0, 0, 1],
    [15, 0, 0, 1],
    [16, 0, 0, 1],
    [17, 0, 0, 1],
    [18, 0, 0, 1],
    [19, 0, 0, 1],
    [20, 0, 0, 1],
    [21, 0, 0, 1],
    [22, 0, 0, 1],
    [23, 0, 0, 1],
    [24, 0, 0, 1],
    [25, 0, 0, 1],
], dtype=np.float32)

# extrinsic (RT) and intrinsic (K) come from load_camera_params() defined above
extrinsic, intrinsic = load_camera_params()

image_coords = intrinsic @ extrinsic @ points.T
image_coords /= image_coords[2]
uv = image_coords[:2, :]

image

c. Obtain LUT (Look Up Table) between BEV image and Image coordinates

Looking at the contents so far, we confirmed that we can create the BEV image as long as we know, for any arbitrary point (X, Y, Z = 0) in world coordinates, which pixel in the image coordinate system it corresponds to.

image

For example, as shown above, you can find out which pixel of the original image a given pixel of the BEV image corresponds to by using the world coordinate information.

If you have this correspondence for every pixel of the BEV image, you can create the BEV image at any time, so obtaining this correspondence is the key to creating the BEV image. Let's look at how to find the LUT (Look Up Table) that represents this correspondence.

Using the code below, you can find the LUT, map_x and map_y, that satisfies the desired world_x_max, world_x_min, world_y_max, world_y_min, world_x_interval, and world_y_interval. The value of the BEV image at coordinate (u, v) can then be obtained by looking up the indices stored in map_x and map_y, as in dst[v][u] = src[ map_y[v][u] ][ map_x[v][u] ].

def generate_direct_backward_mapping(
    world_x_min, world_x_max, world_x_interval, 
    world_y_min, world_y_max, world_y_interval, extrinsic, intrinsic):
    
    print("world_x_min : ", world_x_min)
    print("world_x_max : ", world_x_max)
    print("world_x_interval (m) : ", world_x_interval)
    print()
    
    print("world_y_min : ", world_y_min)
    print("world_y_max : ", world_y_max)
    print("world_y_interval (m) : ", world_y_interval)
    
    world_x_coords = np.arange(world_x_max, world_x_min, -world_x_interval)
    world_y_coords = np.arange(world_y_max, world_y_min, -world_y_interval)
    
    output_height = len(world_x_coords)
    output_width = len(world_y_coords)
    
    map_x = np.zeros((output_height, output_width)).astype(np.float32)
    map_y = np.zeros((output_height, output_width)).astype(np.float32)
    
    for i, world_x in enumerate(world_x_coords):
        for j, world_y in enumerate(world_y_coords):
            # world_coord : [world_x, world_y, 0, 1]
            # uv_coord : [u, v, 1]
            
            world_coord = [world_x, world_y, 0, 1]
            camera_coord = extrinsic[:3, :] @ world_coord
            uv_coord = intrinsic[:3, :3] @ camera_coord
            uv_coord /= uv_coord[2]

            # map_x : (H, W)
            # map_y : (H, W)
            # dst[i][j] = src[ map_y[i][j] ][ map_x[i][j] ]
            map_x[i][j] = uv_coord[0]
            map_y[i][j] = uv_coord[1]
            
    return map_x, map_y

map_x, map_y = generate_direct_backward_mapping(world_x_min, world_x_max, world_x_interval, world_y_min, world_y_max, world_y_interval, extrinsic, intrinsic)

# world_x_min :  7
# world_x_max :  50
# world_x_interval (m) :  0.05

# world_y_min :  -10
# world_y_max :  10
# world_y_interval (m) :  0.025

The size of the generated map_x and map_y is the same as the size of the BEV image. Therefore, in the above example, we have map_x and map_y of size (w=800, h=860), so the lookup can be applied to every index.
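For reference, the double loop above can also be vectorized with numpy broadcasting. The sketch below is my own (the name generate_direct_backward_mapping_vectorized is not from the original post) and produces the same map_x and map_y under the same assumptions:

import numpy as np

def generate_direct_backward_mapping_vectorized(
    world_x_min, world_x_max, world_x_interval,
    world_y_min, world_y_max, world_y_interval, extrinsic, intrinsic):

    # Same grid as the loop version: rows sweep X from max to min, columns sweep Y from max to min.
    world_x_coords = np.arange(world_x_max, world_x_min, -world_x_interval)
    world_y_coords = np.arange(world_y_max, world_y_min, -world_y_interval)
    xx, yy = np.meshgrid(world_x_coords, world_y_coords, indexing="ij")  # (H, W)

    # Homogeneous ground points (X, Y, 0, 1) stacked as a (4, H*W) matrix
    points = np.stack([xx.ravel(), yy.ravel(),
                       np.zeros(xx.size), np.ones(xx.size)], axis=0)

    camera_coords = extrinsic[:3, :] @ points        # (3, H*W)
    uv = intrinsic[:3, :3] @ camera_coords           # (3, H*W)
    uv = uv / uv[2:3, :]                             # perspective divide

    H, W = xx.shape
    map_x = uv[0].reshape(H, W).astype(np.float32)
    map_y = uv[1].reshape(H, W).astype(np.float32)
    return map_x, map_y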

d. Create the BEV image by backward mapping (IPM)

The map_x and map_y created earlier are the key, and now you just need to create the image using backward mapping.

Backward mapping refers to obtaining the desired RGB values by going from each pixel of the target image to be created back to the corresponding pixel of the source image.

The reason for using this method is that when sending values forward from the source image to the target image, it is difficult to assign a value to every target pixel, which leaves unfilled pixels in the target image. For example, as we saw earlier, in areas far from the camera there may be multiple target pixels that correspond to a single source pixel.

image

If you look at the blue area in the picture above, the projected dots start to overlap. This happens more often in distant areas. If you send the dots one by one in the src → target (BEV) direction, more and more BEV pixels end up with no corresponding dot.

Therefore, determining which source pixel to reference in the target → src direction is the way to fill the entire BEV image, and this method is called backward mapping. In the code below, we will call it remap.

Below is the simplest remap method. Since map_x and map_y are float values, they are rounded so that the nearest pixel is retrieved.

import cv2


def remap_nearest(src, map_x, map_y):
    src_height = src.shape[0]
    src_width = src.shape[1]
    
    dst_height = map_x.shape[0]
    dst_width = map_x.shape[1]
    dst = np.zeros((dst_height, dst_width, 3)).astype(np.uint8)
    for i in range(dst_height):
        for j in range(dst_width):
            src_y = int(np.round(map_y[i][j]))
            src_x = int(np.round(map_x[i][j]))
            if 0 <= src_y and src_y < src_height and 0 <= src_x and src_x < src_width:
                dst[i][j] = src[src_y, src_x, :]
    return dst 

# image : the front-view source image (e.g. a cityscapes frame loaded with cv2.imread)
output_image_nearest = remap_nearest(image, map_x, map_y)
output_image = cv2.remap(image, map_x, map_y, cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT)

mask = (output_image > [0, 0, 0])
output_image = output_image.astype(np.float32)
output_image_nearest = output_image_nearest.astype(np.float32)

print("L1 Loss of opencv remap Vs. custom remap nearest : ", np.mean(np.abs(output_image[mask]-output_image_nearest[mask])))
print("L2 Loss of opencv remap Vs. custom remap nearest : ", np.mean((output_image[mask]-output_image_nearest[mask])**2))

# L1 Loss of opencv remap Vs. custom remap nearest :  0.0
# L2 Loss of opencv remap Vs. custom remap nearest :  0.0

remap_nearest is a backward mapping method that rounds the coordinates and accesses the src image; for convenience you can use the cv2.remap function instead. If you pass cv2.INTER_NEAREST as the option, it works the same way as rounding.

If you look at the difference between remap_nearest and cv2.remap at the end, you can see that there is no difference. The generated BEV image is as follows.

image

However, this kind of rounding makes more and more pixels reference the same source pixel as the distance from the camera increases, resulting in artifacts that look like lower resolution, as shown at the top of the image in the picture above.

To improve this problem, bilinear interpolation, which uses the fractional part of the coordinates, is generally used instead of rounding.

Please refer to the following link. Instead of selecting a single rounded value, this method uses the relationship between the four surrounding points and the point to be sampled, as shown in the following figure.
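For reference, with $x_0 = \lfloor x \rfloor$, $x_1 = x_0 + 1$, $y_0 = \lfloor y \rfloor$, $y_1 = y_0 + 1$, the bilinearly interpolated value used in the code below is

$$
I(x, y) \approx (x_1 - x)(y_1 - y)\,I(x_0, y_0) + (x - x_0)(y_1 - y)\,I(x_1, y_0) + (x_1 - x)(y - y_0)\,I(x_0, y_1) + (x - x_0)(y - y_0)\,I(x_1, y_1),
$$

which is exactly the weighted sum w00·im00 + w10·im10 + w01·im01 + w11·im11 computed in bilinear_sampler.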

If you want to use something like this, you can implement it using the code below.

def bilinear_sampler(imgs, pix_coords):
    """
    Construct a new image by bilinear sampling from the input image.
    Args:
        imgs:                   [H, W, C]
        pix_coords:             [h, w, 2]
    :return:
        sampled image           [h, w, c]
    """
    img_h, img_w, img_c = imgs.shape
    pix_h, pix_w, pix_c = pix_coords.shape
    out_shape = (pix_h, pix_w, img_c)

    pix_x, pix_y = np.split(pix_coords, [1], axis=-1)  # [pix_h, pix_w, 1]
    pix_x = pix_x.astype(np.float32)
    pix_y = pix_y.astype(np.float32)

    # Rounding
    pix_x0 = np.floor(pix_x)
    pix_x1 = pix_x0 + 1
    pix_y0 = np.floor(pix_y)
    pix_y1 = pix_y0 + 1

    # Clip within image boundary
    y_max = (img_h - 1)
    x_max = (img_w - 1)
    zero = np.zeros([1])

    pix_x0 = np.clip(pix_x0, zero, x_max)
    pix_y0 = np.clip(pix_y0, zero, y_max)
    pix_x1 = np.clip(pix_x1, zero, x_max)
    pix_y1 = np.clip(pix_y1, zero, y_max)

    # Weights [pix_h, pix_w, 1]
    wt_x0 = pix_x1 - pix_x
    wt_x1 = pix_x - pix_x0
    wt_y0 = pix_y1 - pix_y
    wt_y1 = pix_y - pix_y0

    # indices in the image to sample from
    dim = img_w

    # Apply the lower and upper bound pix coord
    base_y0 = pix_y0 * dim
    base_y1 = pix_y1 * dim

    # 4 corner vertices
    idx00 = (pix_x0 + base_y0).flatten().astype(np.int32)
    idx01 = (pix_x0 + base_y1).astype(np.int32)
    idx10 = (pix_x1 + base_y0).astype(np.int32)
    idx11 = (pix_x1 + base_y1).astype(np.int32)

    # Gather pixels from image using vertices
    imgs_flat = imgs.reshape([-1, img_c]).astype(np.float32)
    im00 = imgs_flat[idx00].reshape(out_shape)
    im01 = imgs_flat[idx01].reshape(out_shape)
    im10 = imgs_flat[idx10].reshape(out_shape)
    im11 = imgs_flat[idx11].reshape(out_shape)

    # Apply weights [pix_h, pix_w, 1]
    w00 = wt_x0 * wt_y0
    w01 = wt_x0 * wt_y1
    w10 = wt_x1 * wt_y0
    w11 = wt_x1 * wt_y1
    output = w00 * im00 + w01 * im01 + w10 * im10 + w11 * im11
    return output

def remap_bilinear(image, map_x, map_y):
    pix_coords = np.concatenate([np.expand_dims(map_x, -1), np.expand_dims(map_y, -1)], axis=-1)
    bilinear_output = bilinear_sampler(image, pix_coords)
    output = np.round(bilinear_output).astype(np.int32)
    return output    

output_image_bilinear = remap_bilinear(image, map_x, map_y)
output_image = cv2.remap(image, map_x, map_y, cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)

mask = (output_image > [0, 0, 0])
output_image = output_image.astype(np.float32)
output_image_bilinear = output_image_bilinear.astype(np.float32)
print("L1 Loss of opencv remap Vs. custom remap bilinear : ", np.mean(np.abs(output_image[mask]-output_image_bilinear[mask])))
print("L2 Loss of opencv remap Vs. custom remap bilinear : ", np.mean((output_image[mask]-output_image_bilinear[mask])**2))

# L1 Loss of opencv remap Vs. custom remap bilinear :  0.045081623
# L2 Loss of opencv remap Vs. custom remap bilinear :  0.66912574

image

If you apply bilinear interpolation as shown in the picture above, you can see that the artifact has improved. Therefore, although it adds computation, it is recommended to use bilinear interpolation, which makes the artifact disappear.

As shown in the result of the above code, the bilinear interpolation value varies depending on the implementation. Unlike the nearest case, the results of remap_bilinear and cv2.remap are not completely identical. (The level of difference is negligible.)

This post is licensed under CC BY 4.0 by the author.