I want to decode an H.265 file or NumPy array generated by a DepthAI camera using ffmpeg.
I tried to decode the H.265 file (or NumPy array) produced by a DepthAI camera with ffmpeg, but it failed. If there is a method recommended by DepthAI, or a way to decode this data with ffmpeg, please let me know.
First, here is what I tried. The code is based on the DepthAI example.
depthai -> extract h265 file
#!/usr/bin/env python3
import depthai as dai

# Create pipeline
pipeline = dai.Pipeline()

# Define sources and output
camRgb = pipeline.create(dai.node.ColorCamera)
videoEnc = pipeline.create(dai.node.VideoEncoder)
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName('h265')

# Properties
camRgb.setBoardSocket(dai.CameraBoardSocket.CAM_A)
camRgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_4_K)
videoEnc.setDefaultProfilePreset(30, dai.VideoEncoderProperties.Profile.H265_MAIN)

# Linking
camRgb.video.link(videoEnc.input)
videoEnc.bitstream.link(xout.input)

# Connect to device and start pipeline
with dai.Device(pipeline) as device:
    # Output queue will be used to get the encoded data from the output defined above
    q = device.getOutputQueue(name="h265", maxSize=30, blocking=True)
    count = 0
    # The .h265 file is a raw stream file (not playable yet)
    with open('video.h265', 'wb') as videoFile:
        print("Press Ctrl+C to stop encoding...")
        try:
            while True:
                h265Packet = q.get()  # Blocking call, waits until new data has arrived
                frame = h265Packet.getData()  # Encoded packet as a uint8 NumPy array
                frame.tofile(videoFile)  # Append the packet to the raw stream file
                # Also save each packet to its own .hevc file
                with open(f'output{count}.hevc', 'wb') as video_file:
                    video_file.write(frame)
                count += 1
        except KeyboardInterrupt:
            # Keyboard interrupt (Ctrl + C) detected
            pass
    print("To view the encoded data, convert the stream file (.h265) into a video file (.mp4) using a command below:")
    print("ffmpeg -framerate 30 -i video.h265 -c copy video.mp4")
- save each packet as its own .hevc file (excerpt from the loop above)
# save hevc files
with open(f'output{count}.hevc', 'wb') as video_file:
    video_file.write(frame)
count += 1
python ffmpeg
I then ran ffmpeg from Python on each per-packet .hevc file. However, only the I-frames were decoded.
import numpy as np
import subprocess as sp
import time
import sys

a = []
for i in range(177):
    input_file = f'images/output{i}.hevc'
    ffmpeg_cmd = [
        'ffmpeg',
        '-y',
        '-i', input_file,
        '-c:v', 'hevc',
        '-pix_fmt', 'yuv420p',
        '-f', 'rawvideo',
        '-analyzeduration', '100M',
        '-probesize', '100M',
        '-'
    ]
    process = sp.Popen(ffmpeg_cmd, stdout=sp.PIPE)
    raw_data, _ = process.communicate()
    if len(raw_data) > 0:
        a.append(len(raw_data))
print(a)
Decode results
- I-frame file
ffmpeg version n5.0.2 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-gpl --enable-libx265 --enable-shared
libavutil 57. 17.100 / 57. 17.100
libavcodec 59. 18.100 / 59. 18.100
libavformat 59. 16.100 / 59. 16.100
libavdevice 59. 4.100 / 59. 4.100
libavfilter 8. 24.100 / 8. 24.100
libswscale 6. 4.100 / 6. 4.100
libswresample 4. 3.100 / 4. 3.100
libpostproc 56. 3.100 / 56. 3.100
Input #0, hevc, from 'images/output150.hevc':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: hevc (Main), yuv420p(tv, bt470bg), 960x520, 30 tbr, 1200k tbn
Stream mapping:
Stream #0:0 -> #0:0 (hevc (native) -> hevc (libx265))
Press [q] to stop, [?] for help
x265 [info]: HEVC encoder version 3.2.1+1-b5c86a64bbbe
x265 [info]: build info [Linux][GCC 9.3.0][64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
x265 [info]: Main profile, Level-3 (Main tier)
x265 [info]: Thread pool created using 64 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 5 / wpp(9 rows)
x265 [warning]: Source height < 720p; disabling lookahead-slices
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 1 inter / 1 intra
x265 [info]: ME / range / subpel / merge : hex / 57 / 2 / 3
x265 [info]: Keyframe min / max / scenecut / bias: 25 / 250 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 20 / 4 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 0
x265 [info]: References / ref-limit cu / depth : 3 / off / on
x265 [info]: AQ: mode / str / qg-size / cu-tree : 2 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : CRF-28.0 / 0.60
x265 [info]: tools: rd=3 psy-rd=2.00 early-skip rskip signhide tmvp b-intra
x265 [info]: tools: strong-intra-smoothing deblock sao
Output #0, rawvideo, to 'pipe:':
Metadata:
encoder : Lavf59.16.100
Stream #0:0: Video: hevc, yuv420p(tv, bt470bg, progressive), 960x520, q=2-31, 30 fps, 30 tbn
Metadata:
encoder : Lavc59.18.100 libx265
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 1 fps=0.0 q=31.6 Lsize= 8kB time=00:00:00.06 bitrate= 956.8kbits/s speed=0.606x
video:8kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
x265 [info]: frame I: 1, Avg QP:31.56 kb/s: 1401.60
x265 [info]: consecutive B-frames: 100.0% 0.0% 0.0% 0.0% 0.0%
encoded 1 frames in 0.07s (14.47 fps), 1401.60 kb/s, Avg QP:31.56
- P/B-frame file
ffmpeg version n5.0.2 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --disable-static --enable-gpl --enable-libx265 --enable-shared
libavutil 57. 17.100 / 57. 17.100
libavcodec 59. 18.100 / 59. 18.100
libavformat 59. 16.100 / 59. 16.100
libavdevice 59. 4.100 / 59. 4.100
libavfilter 8. 24.100 / 8. 24.100
libswscale 6. 4.100 / 6. 4.100
libswresample 4. 3.100 / 4. 3.100
libpostproc 56. 3.100 / 56. 3.100
[hevc @ 0x56192fa4eac0] Format hevc detected only with low score of 1, misdetection possible!
[hevc @ 0x56192fa4fe80] PPS id out of range: 0
Last message repeated 1 times
[hevc @ 0x56192fa4fe80] Error parsing NAL unit #1.
[hevc @ 0x56192fa4eac0] Could not find codec parameters for stream 0 (Video: hevc, none): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
Input #0, hevc, from 'images/output174.hevc':
Duration: N/A, bitrate: N/A
Stream #0:0: Video: hevc, none, 25 tbr, 1200k tbn
Stream mapping:
Stream #0:0 -> #0:0 (hevc (native) -> hevc (libx265))
Press [q] to stop, [?] for help
[hevc @ 0x56192fa55b40] PPS id out of range: 0
[hevc @ 0x56192fa55b40] Error parsing NAL unit #1.
Error while decoding stream #0:0: Invalid data found when processing input
Cannot determine format of input stream 0:0 after EOF
Error marking filters as finished
etc…
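The "PPS id out of range" errors above suggest that the failing files do not contain their own parameter sets (VPS, SPS, PPS). One way to check this is to list the NAL unit types in each per-packet file. The following is a minimal sketch, assuming the files are Annex B byte streams (NAL units separated by 00 00 01 start codes); the helper names `nal_types` and `has_parameter_sets` are made up for illustration and are not part of DepthAI or ffmpeg.

```python
# NAL unit type numbers from the H.265 specification that are relevant here.
NAL_NAMES = {
    1: "TRAIL_R (non-key slice)",
    19: "IDR_W_RADL (key slice)",
    20: "IDR_N_LP (key slice)",
    32: "VPS",
    33: "SPS",
    34: "PPS",
}

START_CODE = b"\x00\x00\x01"  # also matches the tail of the 4-byte start code


def nal_types(data: bytes) -> list:
    """Return the nal_unit_type of every NAL unit in an Annex B byte stream."""
    types = []
    pos = data.find(START_CODE)
    while pos != -1 and pos + 3 < len(data):
        # First NAL header byte: forbidden_zero_bit (1 bit), nal_unit_type (6 bits),
        # then the top bit of nuh_layer_id.
        types.append((data[pos + 3] >> 1) & 0x3F)
        pos = data.find(START_CODE, pos + 3)
    return types


def has_parameter_sets(data: bytes) -> bool:
    """True if the buffer carries VPS, SPS and PPS and so can be decoded on its own."""
    return {32, 33, 34} <= set(nal_types(data))


def describe(path: str) -> str:
    """Summarize one per-packet file, e.g. describe('images/output174.hevc')."""
    with open(path, "rb") as f:
        data = f.read()
    names = [NAL_NAMES.get(t, f"type {t}") for t in nal_types(data)]
    return f"{path}: {names} standalone={has_parameter_sets(data)}"
```

If `has_parameter_sets` returns False for the files that fail, those packets can only be decoded after the decoder has already seen an earlier packet that carries the parameter sets, which would explain why each file fails when passed to ffmpeg in isolation.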
Strangely, when I decode the same data with PyAV (python av), it decodes normally.
I would like to decode it with ffmpeg if possible, so that I can use NVIDIA CUDA.
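Two details in the attempt above may be worth noting. The log line "hevc (native) -> hevc (libx265)" shows that `-c:v hevc` placed after `-i` selected an output encoder, so ffmpeg re-encoded instead of emitting raw frames, and `-analyzeduration`/`-probesize` only affect probing when they appear before `-i`. A possible alternative, sketched here under assumptions, is to feed all packets in order to a single ffmpeg process through stdin so the decoder keeps its state between packets. The function names are hypothetical; `-hwaccel cuda` is a standard ffmpeg input option, but whether it works depends on the ffmpeg build and GPU, which this sketch does not verify.

```python
import subprocess as sp


def build_decode_cmd(use_cuda: bool = False) -> list:
    """Build an ffmpeg command that reads raw HEVC on stdin and writes yuv420p frames to stdout."""
    cmd = ["ffmpeg", "-hide_banner", "-loglevel", "error"]
    if use_cuda:
        # Input option: must come before -i to apply to the decoder.
        cmd += ["-hwaccel", "cuda"]
    cmd += [
        "-f", "hevc", "-i", "pipe:0",   # raw Annex B HEVC from stdin
        "-f", "rawvideo",                # no -c:v here, so the output is raw video
        "-pix_fmt", "yuv420p",
        "pipe:1",
    ]
    return cmd


def split_frames(raw: bytes, width: int, height: int) -> list:
    """Split a yuv420p byte string into whole frames (width * height * 3 / 2 bytes each)."""
    frame_size = width * height * 3 // 2
    return [raw[i:i + frame_size]
            for i in range(0, len(raw) - frame_size + 1, frame_size)]


def decode_stream(packets, width: int, height: int, use_cuda: bool = False) -> list:
    """Concatenate the packets in capture order and decode them in one ffmpeg run."""
    stream = b"".join(bytes(p) for p in packets)
    result = sp.run(build_decode_cmd(use_cuda), input=stream,
                    stdout=sp.PIPE, check=True)
    return split_frames(result.stdout, width, height)
```

Here `packets` would be the per-packet buffers in capture order (for example the contents of output0.hevc, output1.hevc, and so on), and `width` and `height` must match the encoded resolution. Holding the whole stream in memory is only reasonable for short captures; for longer ones, writing to the process's stdin incrementally would be the more natural design.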