hadoop + ffmpeg cloud transcoding
hadoop + ffmpeg: a distributed transcoding system in practice
1. Split the video:
mkvmerge --split size:32m ./heihu01.mp4 -o ./heihu01.%05d.mp4
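mkvmerge cuts on keyframe boundaries near the requested size, so the part count is approximate but can be estimated up front with a ceiling division. A minimal sketch (the source size below is assumed for illustration; on a real file use `stat -c%s heihu01.mp4`):

```shell
#!/bin/sh
# Estimate how many ~32 MB chunks mkvmerge will emit (ceiling division).
filesize=100000000              # assumed source size in bytes (illustrative)
chunk=$((32 * 1024 * 1024))     # 32m, matching --split size:32m
parts=$(( (filesize + chunk - 1) / chunk ))
echo "$parts"                   # → 3
```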
2. Create an HDFS directory for the split segments
hadoop fs -mkdir movies
3. Upload the split segments
for i in `ls heihu01.*.mp4`; do hadoop fs -put $i movies/; done
4. Create the mapper input directory
hadoop fs -mkdir movies_input
5. Generate the mapper input files and upload them
cat > mapper_input.sh << EOF
#!/bin/bash
pwd=\`pwd\`
tmp_file='movies_tmp.txt'
num=2 # number of TaskTrackers
true > \${tmp_file}
hadoop fs -rm movies_input/movies_*
for i in \`ls *.[0-9][0-9][0-9][0-9][0-9].*\`;do echo movies/\$i >> \${tmp_file};done
count="\$(wc -l \${tmp_file}|cut -d' ' -f1)"
if [ \$((\$count%\$num)) -eq 0 ];then
rows="\$((\$count/\$num))"
else
rows="\$((\$count/\$num+1))"
fi
split -l \$rows \${tmp_file} movies_
hadoop fs -put movies_[a-z0-9][a-z0-9] movies_input
EOF
chmod +x mapper_input.sh
./mapper_input.sh
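The row count in mapper_input.sh is a ceiling division, so the segment list is spread evenly and each TaskTracker receives at most one extra line. The arithmetic can be exercised in isolation (the counts below are assumed for illustration):

```shell
#!/bin/sh
# Ceiling division used by mapper_input.sh: rows = ceil(count / num).
count=7   # assumed number of segment paths in movies_tmp.txt
num=2     # assumed number of TaskTrackers
if [ $((count % num)) -eq 0 ]; then
  rows=$((count / num))
else
  rows=$((count / num + 1))
fi
echo "$rows"   # → 4
```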
6. Create the upload directory for the transcoded output
hadoop fs -mkdir movies_put
7. Check the user identity and working directory under Hadoop Streaming
1) Write the script
cat > test_mapper.sh << EOF
#!/bin/bash
set -x
id="\`whoami\`"
mkdir -p /tmp/\$id
host=\`hostname\`
pwd=\`pwd\`
uid=\`whoami\`
put_dir='movies_put'
while read line; do
input=\$line
filename=\`basename \$input\`
echo "\$uid@\$host:\$pwd> hadoop fs -get \$input /tmp/\$id/\$filename"
echo "\$uid@\$host:\$pwd> ffmpeg -y -i /tmp/\$id/\$filename -s qcif -r 20 -b 200k -vcodec mpeg4 -ab 64k -ac 2 -ar 22050 -acodec libfaac output-\$filename.3gp"
echo "\$uid@\$host:\$pwd> hadoop fs -put output-\$filename \${put_dir}"
done
rm -rf /tmp/\$id
EOF
chmod a+x test_mapper.sh
2) Run a local test
cat movies_aa | ./test_mapper.sh
3) Run the test under Hadoop Streaming
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-streaming-1.0.2.jar -input movies_input -output movies_output -mapper test_mapper.sh -file test_mapper.sh
4) Inspect the Hadoop Streaming results
hadoop fs -cat /user/$(whoami)/movies_output/part-00000 | head
5) Delete the test output
hadoop fs -rmr movies_output # remove the streaming test output
8. Run the transcode with Hadoop Streaming
1) Write the script
cat > mapper.sh << EOF
#!/bin/bash
id="hduser"
mkdir -p /tmp/\$id
host=\`hostname\`
pwd=\`pwd\`
uid=\`whoami\`
put_dir='movies_put'
cd "/tmp/\$id"
true > a
while read line; do
input=\$line
filename=\`basename \$input\`
echo "\$uid@\$host> hadoop fs -get \$input /tmp/\$id/\$filename"
/usr/local/hadoop/bin/hadoop fs -get \$input /tmp/\$id/\$filename 2>&1
echo "\$uid@\$host> ffmpeg -y -i /tmp/\$id/\$filename -s 320x240 -r 20 -b 200k -vcodec mpeg4 -ab 64k -ac 2 -ar 22050 -qscale 5 -acodec libfaac output-\$filename.3gp"
ffmpeg -y -i /tmp/\$id/\$filename -s 320x240 -r 20 -b 200k -vcodec mpeg4 -ab 64k -ac 2 -ar 22050 -qscale 5 -acodec libfaac output-\$filename.3gp < a 2>&1
/usr/local/hadoop/bin/hadoop fs -put output-\$filename.3gp \${put_dir} 2>&1
echo "\$uid@\$host> hadoop fs -chown \$id \${put_dir}/output-\$filename.3gp"
/usr/local/hadoop/bin/hadoop fs -chown \$id \${put_dir}/output-\$filename.3gp 2>&1
done
rm -f a
rm -rf /tmp/\$id
EOF
chmod a+x mapper.sh
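mapper.sh derives every local and HDFS file name from the input path alone, which keeps the mapper stateless across TaskTrackers. The naming step in isolation (the input path below is one illustrative record, not a fixed value):

```shell
#!/bin/sh
# Derive the transcoded output name from one HDFS input path, as mapper.sh does.
input="movies/heihu01.00003.mp4"   # illustrative line from a movies_* input file
filename=$(basename "$input")
output="output-$filename.3gp"
echo "$output"   # → output-heihu01.00003.mp4.3gp
```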
2) Run a local test
cat movies_aa | ./mapper.sh
hadoop fs -rm movies_put/* # remove files left over from the local run
3) Run the script on Hadoop
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-streaming-1.0.2.jar -input movies_input -output movies_output -mapper mapper.sh -file mapper.sh
4) Verify the results
hadoop fs -cat movies_output/part-00000 | head
hadoop fs -ls movies_put
Found 6 items
-rw-r--r-- 3 hduser supergroup 19584280 2012-05-28 13:53 /user/hduser/movies_put/output-heihu01.00001.mp4.3gp
-rw-r--r-- 3 hduser supergroup 14872878 2012-05-28 13:54 /user/hduser/movies_put/output-heihu01.00002.mp4.3gp
-rw-r--r-- 3 hduser supergroup 12052800 2012-05-28 13:55 /user/hduser/movies_put/output-heihu01.00003.mp4.3gp
-rw-r--r-- 3 hduser supergroup 11174014 2012-05-28 13:53 /user/hduser/movies_put/output-heihu01.00004.mp4.3gp
-rw-r--r-- 3 hduser supergroup 15713836 2012-05-28 13:55 /user/hduser/movies_put/output-heihu01.00005.mp4.3gp
-rw-r--r-- 3 hduser supergroup 13084511 2012-05-28 13:56 /user/hduser/movies_put/output-heihu01.00006.mp4.3gp
5) Merge the segments with a reducer
cat > reduce.sh << EOF
#!/bin/bash
tmp_file="movies_tmp.txt"
id="hduser"
pwd=\`pwd\`
dir="/tmp/\${id}_merger"
mkdir -p \$dir
cd \$dir
true > \${tmp_file}
hadoop fs -ls movies_put|awk '{print \$8}'|sed '/^\$/d' >> \${tmp_file}
unset m
for i in \`cat \${tmp_file}\`
do
hadoop fs -get \$i \$dir
filename=\`basename \$i\`
if [ ! -z \$m ];then
filename="+\$filename"
fi
echo \$filename >> \$dir/files.txt
m=\$((m+1))
done
mkvmerge -o \$dir/output.3gp \`cat \$dir/files.txt\`
hadoop fs -put \$dir/output.3gp movies_put/
rm -rf \$dir
EOF
chmod +x reduce.sh
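mkvmerge appends inputs when later file names carry a '+' prefix, and reduce.sh builds that list with the `m` counter: the first segment stays bare, every later one is prefixed. The list-building step alone (segment names are assumed for illustration):

```shell
#!/bin/sh
# Build an mkvmerge append list: first segment bare, the rest prefixed with '+'.
unset m
list=""
for f in part1.3gp part2.3gp part3.3gp; do   # illustrative segment names
  name="$f"
  if [ -n "$m" ]; then
    name="+$name"
  fi
  list="${list:+$list }$name"
  m=$((m + 1))
done
echo "$list"   # → part1.3gp +part2.3gp +part3.3gp
```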
6) Run a local test
./reduce.sh
7) Run the script on Hadoop
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-streaming-1.0.2.jar -input movies_input -output movies_output -mapper mapper.sh -reducer reduce.sh -file reduce.sh -file mapper.sh
8) Verify the results
hadoop fs -ls movies_put
Found 7 items
-rw-r--r-- 3 hduser supergroup 19584280 2012-05-29 14:15 /user/hduser/movies_put/output-heihu01.00001.mp4.3gp
-rw-r--r-- 3 hduser supergroup 14872878 2012-05-29 14:16 /user/hduser/movies_put/output-heihu01.00002.mp4.3gp
-rw-r--r-- 3 hduser supergroup 12052800 2012-05-29 14:17 /user/hduser/movies_put/output-heihu01.00003.mp4.3gp
-rw-r--r-- 3 hduser supergroup 11174014 2012-05-29 14:15 /user/hduser/movies_put/output-heihu01.00004.mp4.3gp
-rw-r--r-- 3 hduser supergroup 15713836 2012-05-29 14:16 /user/hduser/movies_put/output-heihu01.00005.mp4.3gp
-rw-r--r-- 3 hduser supergroup 13084511 2012-05-29 14:17 /user/hduser/movies_put/output-heihu01.00006.mp4.3gp
-rw-r--r-- 3 hduser supergroup 86175913 2012-05-29 14:17 /user/hduser/movies_put/output.3gp
hadoop fs -cat movies_output/part-00000
Appendix:
Debugging Hadoop Streaming
The job output collects only standard output, so to debug a failing command you must merge its error stream into standard output,
i.e. append "2>&1" to the command, for example:
mkvmerge -o $dir/output.3gp `cat $dir/files.txt` 2>&1
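The effect of the redirect can be demonstrated with any failing command (the path below is deliberately nonexistent):

```shell
#!/bin/sh
# Without 2>&1 the error message goes to stderr and never reaches part-00000;
# with it, the message lands on stdout where Hadoop Streaming collects it.
captured=$(ls /no/such/path 2>&1)
echo "captured: $captured"
```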