This is a follow-up and a “get what I’ve done down” to Listening to Skype Voicemail .dat files.
I got back on the “Let’s get these 70+ Skype voicemails listened to” bandwagon…
I have a D-Link DCS-8526LH camera in my living room pointed at my front door. I originally purchased it a long, long time ago to keep an eye on the kids during school while at home during COVID. It’s aged well, and it’s a nice thing to have. It lets me know if someone’s coming in through my front door whether I want to know about it or not, and that’s all I need to know.
Had to compare two files at work today. Actually, I had to compare one file to a series of files to see what data exists in both of them. This technically comes down to a LEFT JOIN where we only keep the left-side rows that also exist on the right side (in effect, an inner join).
Written as a PHP script, it comes down to:
<?php
ini_set('memory_limit', '256M'); // ini directive names are lowercase
if (!file_exists($argv[1])) { die('file ' . $argv[1] . ' not found'); }
if (!file_exists($argv[2])) { die('file ' . $argv[2] . ' not found'); }

// Load every non-empty line of the first file into memory.
$fp = fopen($argv[1], 'rt');
$lines = [];
do {
    $line = trim(fgets($fp));
    if (strlen($line) > 0) {
        $lines[] = $line;
    }
} while (!feof($fp));
fclose($fp);

// Print each non-empty line of the second file that also appears in the first.
$fp = fopen($argv[2], 'rt');
do {
    $line = trim(fgets($fp));
    if (strlen($line) > 0) {
        // Third argument forces an exact (strict) string comparison.
        if (in_array($line, $lines, true)) {
            echo "$line\n";
        }
    }
} while (!feof($fp));
fclose($fp);
This script works like a charm, but it takes a while with large numbers of records, since in_array() scans the whole array for every line of the second file.
After some googling, it turns out this script isn’t really necessary if you use grep correctly. You also gain the speed of a compiled executable in one fell swoop.
$ grep -Fxf [file1] [file2]
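Output is exactly the same: -f reads the patterns from file1, -F treats them as fixed strings rather than regexes, and -x only matches whole lines.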
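If the comparison has to stay in PHP, the slow part is most likely the in_array() call, which walks the whole array on every lookup. A minimal sketch of one common alternative, assuming both files fit in memory: store the first file's lines as array keys and test membership with isset(). The function name common_lines() is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: returns the lines of $right that also appear in $left.
// Array keys act as a hash set, so each lookup is a hash probe
// instead of the linear scan that in_array() performs.
function common_lines(array $left, array $right): array
{
    $seen = [];
    foreach ($left as $line) {
        $line = trim($line);
        if ($line !== '') {
            $seen[$line] = true;
        }
    }

    $matches = [];
    foreach ($right as $line) {
        $line = trim($line);
        if ($line !== '' && isset($seen[$line])) {
            $matches[] = $line;
        }
    }
    return $matches;
}

// Usage with in-memory sample data; real input could come from file().
echo implode("\n", common_lines(
    ['apple', 'banana', 'cherry'],
    ['banana', 'durian', 'apple']
)), "\n";
```

Matches come back in the order they appear in the second list, which mirrors the order grep prints them in.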
In writing “quad-quad”, which is a set of four 4-letter speakable words that can be used as a user-friendly “bookmark” for easily finding a record, I was writing a “quick” program to extract the contents of wikidatawiki-20220820-pages-articles-multistream.xml (a Wikidata dump) and ran into a large delay in the following loop:
$alphas = 'qwertyuiopasdfghjklzxcvbnm ';
$newline = '';
for ($x = 0; $x < strlen($line); $x++) {
    $c = substr($line, $x, 1);
    if (strpos($alphas, $c) !== false) {
        $newline = $newline . $c;
    } else {
        $newline = $newline . ' ';
    }
}
The loop’s main purpose is to sanitize any non-letter data by replacing unknown characters with a space for later processing. The end result would be words that I could filter down to 4-character words and tally up.
When the program read a line around 1 MB in length, it would “hang” for a bit as it chewed through the data. In a nutshell, 25,100,655 bytes of data took 24m36s. It was time to optimize.
Replacing the loop with the following regex improved performance immensely.
$newline = preg_replace('/[^a-z]/', ' ', $line);
The same amount of data took 1.892s.
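To show how the sanitized line might feed the later tallying step, here is a minimal sketch, not the actual quad-quad code. The function name tally_four_letter_words() and the sample input are assumptions made for illustration. Note that, like the original loop, it only keeps lowercase a-z, so uppercase letters are turned into spaces too.

```php
<?php
// Hypothetical helper: sanitize a line, then count its four-letter words.
function tally_four_letter_words(string $line, array $counts = []): array
{
    // Same substitution as above: anything that is not a-z becomes a space.
    $clean = preg_replace('/[^a-z]/', ' ', $line);

    // Split on runs of spaces, dropping the empty pieces.
    foreach (preg_split('/ +/', $clean, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (strlen($word) === 4) {
            $counts[$word] = ($counts[$word] ?? 0) + 1;
        }
    }
    return $counts;
}

// Usage: pass the running tally back in for each new line.
$counts = tally_four_letter_words('<title>blue moon, blue sky</title>');
print_r($counts);
```

Passing the previous result in as the second argument lets the tally accumulate across lines without a global variable.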
Lesson: If you don’t know regexes, learn regexes.
Came across this and felt the need to write the “Power of 10” rules…