Wednesday, September 13, 2017

How to make MP4 video files smaller

The important command for this is:

ffmpeg -nostdin -i "$ORIGINAL_FILE" -vcodec libx264 -crf 26 "$OUTPUT_FILE"

The CRF value can be raised: higher values give smaller files at the cost of visible quality. For libx264 the scale runs from 0 to 51, with 23 as the default.

Several files can be processed in one go.

Let's find files with a bitrate above 3000 kb/s (which probably means we have not processed them yet):

find . -name "*.mp4" | while read -r F; do BTR=`ffprobe "$F" 2>&1 | sed -n 's/.*bitrate:[^0-9]*\([0-9]*\).*/\1/p'`; if [ "${BTR:-0}" -gt 3000 ]; then echo "$F"; fi; done | tee tmp_mp4_list

If we want to see some details, such as the bitrate and the file size, we can print them as well:

find . -name "*.mp4" | while read -r F; do BTR=`ffprobe "$F" 2>&1 | sed -n 's/.*bitrate:[^0-9]*\([0-9]*\).*/\1/p'`; if [ "${BTR:-0}" -gt 3000 ]; then echo "$BTR `ls -lh "$F"`"; fi; done | tee tmp_mp4_list_details
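The sed expression above scrapes the human-readable text that ffprobe prints. As a possible alternative, ffprobe can print the container bitrate directly in bits per second. The sketch below wraps that in two helper functions; the names bps_to_kbps and bitrate_kbps are illustrative and not part of the original commands.

```shell
# Convert a bits-per-second value to whole kb/s.
# Empty or non-numeric input (for example "N/A") is reported as 0.
bps_to_kbps() {
    case "$1" in
        ''|*[!0-9]*) echo 0 ;;
        *) echo $(( $1 / 1000 )) ;;
    esac
}

# Print the overall bitrate of a media file in kb/s (requires ffprobe).
bitrate_kbps() {
    bps_to_kbps "$(ffprobe -v error -show_entries format=bit_rate \
        -of default=noprint_wrappers=1:nokey=1 "$1")"
}

# Possible use inside the loop:
#   if [ "$(bitrate_kbps "$F")" -gt 3000 ]; then echo "$F"; fi
```

Because the helper always prints a number, the -gt test does not fail on files whose bitrate ffprobe cannot determine.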

We can pass the output from the first find command to ffmpeg:

cat tmp_mp4_list | while read -r F; do F_ORIG=`echo "$F" | sed 's/\(\.[^.]*\)$/_orig\1/'`; F_NEW=`echo "$F" | sed 's/\.mp4$/.MP4/'`; mv -i "$F" "$F_ORIG" && ffmpeg -nostdin -i "$F_ORIG" -vcodec libx264 -crf 26 "$F_NEW"; RC=$?; if [ "$RC" -ne 0 ]; then echo "$F" "$RC" >> tmp_errors; fi; done

This renames the original file to file_orig.mp4. The new file gets an uppercase extension, .MP4, so that we can recognize that it has already been processed. We deliberately do not add extra characters to the name: a longer filename would take up more space on a mobile display. If an error occurs, the filename is written to the 'tmp_errors' file together with the exit status.
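The two filenames can also be derived with plain shell parameter expansion instead of sed. This is only a sketch of the same naming scheme; the function names orig_name and new_name are made up for illustration.

```shell
# "dir/clip.mp4" -> "dir/clip_orig.mp4": insert _orig before the last extension.
orig_name() {
    printf '%s\n' "${1%.*}_orig.${1##*.}"
}

# "dir/clip.mp4" -> "dir/clip.MP4": mark the file as already processed.
new_name() {
    printf '%s\n' "${1%.mp4}.MP4"
}
```

Both functions assume that the name ends in .mp4, which holds for every entry in tmp_mp4_list.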

Wednesday, May 17, 2017

Recover deleted photos from SD card

My friend came back from Africa with all the photos on his SD card deleted. He knew that deleted photos can be recovered as long as they have not been overwritten by new ones, so he stopped using the card until he could hand it over to me.

I used the program photorec to recover the photos. He used them happily for presentations.

On modern phones (Android, for example), if the photos are stored in the internal storage instead of on the SD card, I do not know an easy way to recover them (unless the phone is rooted).

Thursday, February 9, 2017

Hard disk full on Ubuntu

If you run low on disk space on Ubuntu (or possibly on another Linux flavour), baobab is a useful program for finding out how much space each directory takes.

You might see that /lib/modules contains gigabytes of files related to old kernel versions. To remove them, let apt clean up the packages that are no longer needed:
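On a machine without a graphical desktop, a rough command-line counterpart is du. The sketch below assumes GNU du, as shipped with Ubuntu; the function name biggest_dirs is made up for illustration.

```shell
# List the largest entries directly under a directory, sizes in KiB.
# $1: directory to inspect (default: current directory)
# $2: number of lines to show (default: 10)
# The first line is the total for the directory itself.
biggest_dirs() {
    du -x -k --max-depth=1 "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, biggest_dirs /lib/modules shows the space taken by each installed kernel version. The -x option keeps du on a single filesystem, so mounted drives are not counted.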

apt install --auto-remove

Tuesday, January 31, 2017

Migrate a WordPress blog into HubSpot

There is a tutorial on how to migrate WordPress blog posts into HubSpot. If anything more complicated is needed, e.g. dealing with language variants, here are some commands that help to automate it. I am also offering the migration as a service.

The original blog distinguished its language variants by URL: English posts had no language code in the address, while posts in other languages carried one (in this case, "ru" for Russian).

Therefore, we process English blog items separately:

{ xmlstarlet ed -d '//item[not(contains(link,""))]' all.xml; echo ""; } > all-en.xml

Other languages are processed automatically with the help of a loop:

for L in cs de es fr it ja pl pt-br ru tr uk ; do export F=all-$L.xml;  { xmlstarlet ed  -d '//item[not(contains(link,"'$L'/"))]' all.xml; echo ""; } > "$F"; sed -r -i "s:(<link.*$L/:\1:g" $F; done

Redirects have to be set up in the HubSpot settings so that the old URLs stay accessible and point to the new locations of the articles. This helps to maintain the ranking in search engines and does not break links from other sites.

# English, without a loop
L="" perl -ne 'if(/^http:\/\/*),http:\/\/*)$/ && not ($1 eq $2)) { print "$1,$2\n"}' > redirects-languages.txt

# other languages
for L in de it fr ru pl pt-br cs es; do
perl -ne 'if(/^http:\/\/*),http:\/\/*)$/ && not ("/'$L'$1" eq $2)) { print "/'$L'$1,$2\n"}' >> redirects-languages.txt
done
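The idea behind these commands can be sketched with sed and awk. Everything specific here is an assumption made for illustration: the input is taken to be lines of the form "old URL,new URL", the example.com hosts are placeholders, and the function name redirect_pairs is hypothetical. The sketch keeps only the pairs whose old path (with the language prefix) differs from the new path.

```shell
# Read "old URL,new URL" lines on stdin and print "old path,new path"
# for every pair where a redirect is actually needed.
# $1: language prefix such as "/ru", or empty for English.
redirect_pairs() {
    # Strip scheme and host from both URLs, keeping only the paths.
    sed -n 's#^https\{0,1\}://[^/,]*\(/[^,]*\),https\{0,1\}://[^/,]*\(/[^,]*\)$#\1,\2#p' |
    # Prefix the old path with the language and drop pairs that already match.
    awk -F, -v p="$1" '(p $1) != $2 { print p $1 "," $2 }'
}

# Example with placeholder hosts:
#   echo 'http://old.example.com/y,http://new.example.com/blog/y' | redirect_pairs "/ru"
```

Lines that do not match the assumed format are skipped silently, so it is worth comparing the number of input and output lines before uploading the result.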

One may have to deal with CDATA. The CDATA markers inside script tags can be stripped in Vim:

:%s#\(<script.*\)// <!\[CDATA\[#\1#
:%s#// \]\]\]\]><!\[CDATA\[></script>#</script>#

Redirects can be uploaded in bulk. Before uploading the new ones, we deleted the old ones first. The deletion can be automated with an iMacros script:

TAG POS=1 TYPE=SPAN ATTR=CLASS:dropdown-targetsettings-icon&&TXT:
TAG POS=1 TYPE=A ATTR=ID:hs-fancybox-ok

With these commands, it was possible to migrate thousands of blog posts from WordPress into HubSpot.