This week at work I was faced with a bug that manifested as Windows XP clients
being slower to access parts of a web interface than Windows 7 or 10. This was
strange because in Wireshark the requests looked basically the same.
It turned out that the embedded system hosting the web interface was rejecting
Ethernet frames larger than 1500 bytes. This was most likely because of a
misinterpretation of the MTU as referring to the frame size (at layer 2) instead
of the payload size (at layer 3).
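To make the misinterpretation concrete, here is a minimal sketch in Python (made-up names; the real driver is C) of the kind of length check that would produce this behavior. A standard Ethernet II header is 14 bytes, so a frame carrying a full 1500-byte payload is 1514 bytes, and a check that compares the whole frame against the MTU rejects exactly those full-sized frames:

ETH_HEADER_LEN = 14   # destination MAC (6) + source MAC (6) + EtherType (2)
MTU = 1500            # layer-3 payload limit

def frame_ok_buggy(frame):
    # Buggy: treats the MTU as a limit on the whole layer-2 frame,
    # so any full-sized packet (1514 bytes with the header) is rejected.
    return len(frame) <= MTU

def frame_ok_fixed(frame):
    # Fixed: only the payload after the Ethernet header counts toward the MTU.
    return len(frame) - ETH_HEADER_LEN <= MTU

full_sized = b"\x00" * (ETH_HEADER_LEN + MTU)
assert not frame_ok_buggy(full_sized)  # dropped by the buggy check
assert frame_ok_fixed(full_sized)      # accepted once fixed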
The strangest part of this issue was just how consistent the effect was
across all browsers tested on a given version of Windows. Windows XP took nearly
22 seconds to complete a request that took just over 2 seconds on Windows 7.
The core of this behavior ended up being a side-effect of the TCP retransmission
timeout.
The default retransmission timeout for Windows XP is stored in the
TCPInitialRTT registry value for a given network adapter. The default value,
when the registry value does not exist, is 3000ms. This aligned with the
observation under XP that the first retransmission occurred after 3
seconds with subsequent retransmissions occurring at 6 and 12 seconds. The
first retransmission was sent unfragmented while the second and third
retransmissions were a fragmented version of the original frame, which was
ultimately accepted. The fragmented payloads were at most 576 bytes, which
seemed like an interesting size, but I did not investigate.
Windows 7, on the other hand, appears to retry after 300 milliseconds, with
subsequent retries at 600ms and 1200ms. The same basic behavior was
followed: the first retry was the full payload while the subsequent retries
were fragmented into at most 576-byte payloads. The key difference appeared
to be the initial retransmission timeout, which masked the underlying issue
for Windows 7 clients.
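The arithmetic lines up with the page-load times above. A toy model (Python, purely illustrative, not how either TCP stack is implemented) of a timeout that doubles after each attempt gives cumulative retransmission times of 3, 9, and 21 seconds for XP's 3000ms initial timeout, and 0.3, 0.9, and 2.1 seconds for Windows 7's 300ms:

def retransmit_times(initial_rto_ms, retries=3):
    # Cumulative times (in ms) at which each retransmission fires,
    # assuming the timeout doubles after every attempt.
    t, rto, times = 0, initial_rto_ms, []
    for _ in range(retries):
        t += rto
        times.append(t)
        rto *= 2
    return times

print(retransmit_times(3000))  # Windows XP: [3000, 9000, 21000] -> ~21s to success
print(retransmit_times(300))   # Windows 7:  [300, 900, 2100]    -> ~2.1s to success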
While I did not search extensively, I did not find an explanation for either
the doubling back-off of the retransmissions or the 576-byte payload size.
The rather quick fix was to change the Linux driver to accept frames up to the
actual size that could be handled, and to introduce an error message when a
frame is rejected for length reasons.
A friend asked several people on IRC about NameSilo's API and dynamic DNS
entries. He had found a PowerShell script to update a subdomain with the
current IP address of the system running the script. The subdomain detail was
the crux of the question: how to get it to update a "naked" domain. Several of
us read through the API reference, but the dnsUpdateRecord function didn't
explain how to update the base domain's A record. It turned out that simply
leaving off the rrhost parameter was sufficient to get the job done.
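Concretely, based on the parameters the script below ends up sending (key, record id, and values made up here), the two request shapes differ only in whether rrhost is present:

# Updating a subdomain: rrhost names the record
https://www.namesilo.com/api/dnsUpdateRecord?version=1&type=xml&key=KEY&domain=example.com&rrid=ID&rrhost=test&rrvalue=1.2.3.4&rrttl=3600

# Updating the naked domain: omit rrhost entirely
https://www.namesilo.com/api/dnsUpdateRecord?version=1&type=xml&key=KEY&domain=example.com&rrid=ID&rrvalue=1.2.3.4&rrttl=3600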
After we were done iterating on it, we had a PowerShell function to update any
record, including enough intelligence to handle the base domain case. I don't
believe non-A records were tested, but it met the need of updating the base
domain and a subdomain or two from a scheduled task.
# NameSilo API Dynamic DNS
# Variables
$APIKey = ""
$Domain = ""

function NameSilo-dnsUpdateRecord {
    param ([string]$APIKey, [string]$Domain, [string]$Record, [string]$Type)

    # Retrieve the DNS entries in the domain.
    $listdomains = Invoke-RestMethod -Uri "https://www.namesilo.com/api/dnsListRecords?version=1&type=xml&key=$APIKey&domain=$Domain"
    $Records = $listdomains.namesilo.reply.resource_record | Where-Object { $_.type -eq $Type }

    # Find the record to update; an empty -Record means the naked domain.
    $UpdateRecord = $null
    $IsNaked = $False
    foreach ($r in $Records) {
        if ([string]::IsNullOrEmpty($Record) -and $r.host -eq $Domain) {
            $UpdateRecord = $r
            $IsNaked = $True
            break
        } elseif ($r.host -eq "$($Record).$($Domain)") {
            $UpdateRecord = $r
            break
        }
    }
    if ($null -eq $UpdateRecord) {
        echo "Error: Could not find requested record: $($Record).$($Domain)"
        return  # return rather than Exit so later invocations still run
    }

    $CurrentIP = $listdomains.namesilo.request.ip
    $RecordIP = $UpdateRecord.value
    $RecordID = $UpdateRecord.record_id

    # Only update the record if necessary.
    if ($CurrentIP -ne $RecordIP) {
        $url = "https://www.namesilo.com/api/dnsUpdateRecord?version=1&type=xml&key=$APIKey&domain=$Domain&rrid=$RecordID"
        if ($IsNaked -eq $False) {
            $url += "&rrhost=$Record"
        }
        $url += "&rrvalue=$CurrentIP&rrttl=3600"
        $update = Invoke-RestMethod -Uri $url
    } else {
        echo "IP Address has not changed."
    }
}

# Invocations:
NameSilo-dnsUpdateRecord -APIKey $APIKey -Domain $Domain -Record "" -Type "A"
NameSilo-dnsUpdateRecord -APIKey $APIKey -Domain $Domain -Record "*" -Type "A"
NameSilo-dnsUpdateRecord -APIKey $APIKey -Domain $Domain -Record "test" -Type "A"
Since I hardly do anything in PowerShell, aside from trying to use it more
than cmd.exe on Windows because it is a resizable window, I did a little more
reading after this was written and concluded that it is likely not
representative of PowerShell best practices. But it is posted here just in
case it might be useful to someone.
In playing with Bitcoin, in this case specifically with the bitcoin-qt client,
I found myself wanting to control more granularly which of my coins I spent.
This probably isn't something most people care about, or maybe they even solve
it by using multiple wallets, but I thought that it would be nice to choose
which addresses were used for transactions. I found that I was not alone in
that desire. But sadly, the patch has yet to be merged despite going through a
number of iterations.
Enter contrib/spendfrom/spendfrom.py. This Python script purported to solve
the problem to some extent, but it wasn't quite as easy to get working as I
had hoped. There is a README.md that highlights a dependency on "jsonrpc."
That seemed easy enough: since I run Windows on this particular machine, I
tried using C:\Python27\Scripts\easy_install.exe jsonrpc, which indeed
installed a jsonrpc, just not the one linked in the documentation, which I
overlooked.
Once I got the right jsonrpc checked out from Bazaar and copied into my
site-packages directory, I thought I was good to go. However, I ran into a
problem that spendfrom.py tests for:
# json here is the object provided by the jsonrpc package (see below);
# Decimal comes from the standard decimal module.
def check_json_precision():
    """Make sure json library being used does not lose precision converting BTC values"""
    n = Decimal("20000000.00000003")
    satoshis = int(json.loads(json.dumps(float(n)))*1.0e8)
    if satoshis != 2000000000000003:
        raise RuntimeError("JSON encode/decode loses precision")
So I started digging and found that the json object that jsonrpc comes with
did some serialization using unicode() that showed a loss of precision with
the given value. This was pretty easy to verify, and I pushed a change to a
GitHub-hosted version of the json-rpc.org bzr repository, thanks to
git-remote-bzr. The change inserts a call to repr() before passing the
resulting string to unicode().
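Verifying it looked something like this in a Python 2 shell: unicode(), like str(), formats floats with only 12 significant digits, while repr() round-trips the full value:

# Python 2.x
n = 20000000.00000003
print repr(unicode(n))   # u'20000000.0' -- the trailing satoshis are gone
print repr(n)            # '20000000.00000003'
print int(float(unicode(n)) * 1.0e8)  # 2000000000000000 -> fails check_json_precision
print int(float(repr(n)) * 1.0e8)     # 2000000000000003 -> passes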
Finally, to get everything to work, I replaced all of the uses of Decimal()
in the spendfrom.py script with float() and have not seen any issues.
Granted, I have not done a huge number of transactions, nor have I done any
that used more than four decimal places. Hopefully I didn't introduce some
insidious bug.
I was presented with the task of extracting the plain text from some XML
formatted closed captions. I was in a "quick and dirty" problem solving mood,
so clearly regular expressions were going to be involved. As such, I started
out with:
sed -r 's/<\/?[^>]+>/\r\n/g' data.xml | grep -v '^$' > data.txt
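For comparison, the same quick-and-dirty stripping could be sketched in Python (a rough equivalent of the sed | grep pipeline above, for illustration; data.xml and data.txt as before):

import re

with open("data.xml") as f:
    text = f.read()

# Replace every tag with a newline, then drop the blank lines,
# mirroring the sed | grep pipeline.
lines = [l for l in re.sub(r"</?[^>]+>", "\n", text).split("\n") if l.strip()]
with open("data.txt", "w") as out:
    out.write("\n".join(lines))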
Since this was XML, of course there were some entities. And to make matters
worse, there were not only the XML named entities (apos, gt, lt, etc.) but
also hex encoded entities for things like music notes, which are commonly
used in closed captions to tell the viewer that music is playing. This is one
of the big differences between closed captions and subtitles.
My first thought was that Python should be able to help me solve this problem.
It's a "web friendly" language. People must do this all the time! And
apparently they must, because I found this snippet on this blog
post:
import re, htmlentitydefs

##
# Removes HTML or XML character references and entities from a text string.
#
# @param text The HTML (or XML) source text.
# @return The plain text, as a Unicode string, if necessary.
def unescape(text):
    def fixup(m):
        text = m.group(0)
        if text[:2] == "&#":
            # character reference
            try:
                if text[:3] == "&#x":
                    return unichr(int(text[3:-1], 16))
                else:
                    return unichr(int(text[2:-1]))
            except ValueError:
                pass
        else:
            # named entity
            try:
                text = unichr(htmlentitydefs.name2codepoint[text[1:-1]])
            except KeyError:
                pass
        return text # leave as is
    return re.sub("&#?\w+;", fixup, text)
This didn't actually work because my text included &apos;, which is not in
the htmlentitydefs.name2codepoint dictionary. That fact led me to Python
Issue #11113 and an html5 dictionary that included all of the desired
entities. The issue indicated that the changes were added in Python 3
somewhere along the line, and the html5 dictionary I mentioned was available
here.
At this point, my problem was solved. But I couldn't help but think that there
was a better way. After a little searching I found that something as simple as
the following solved my problem:
import HTMLParser
HTMLParser.HTMLParser().unescape(text)
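For example, in a Python 2 shell (with a made-up caption string; unescape handles named entities, including apos, plus decimal and hex character references):

import HTMLParser

text = "&#x266a; It&apos;s raining &amp; pouring &#9835;"
print repr(HTMLParser.HTMLParser().unescape(text))
# u"\u266a It's raining & pouring \u266b"

(Python 3 later consolidated this as html.unescape(), backed by the full html5 entity table from the issue above.)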
All of this seemed like a good exercise in playing with Python, and seemed worth
recording.
I was asked to take a quick look at getting a PHP extension to work. Little did I know the can of worms that particular question would open.
Some of my coworkers were trying to evaluate a piece of PHP-based support call and ticket tracking software called Kayako. This particular web application uses Zend Guard to protect its code, and Kayako has yet to release a version encoded with Zend Guard 5.5, which added support for PHP 5.3. This turned out to be quite important when I attempted to bring up a virtual machine to aid my coworkers' testing of Kayako.
After getting the virtual machine set up with Ubuntu 10.04.1, using the OpenSSH Server and LAMP Server canned configuration options to get a good set of necessary base packages for this particular use case, I did a safe-upgrade, rebooted (new kernel), and added the "zendframework" package. The next step was to add support for running Zend Guard encoded applications, which is provided by the Zend Guard Loader (formerly called the Zend Optimizer). This is done by way of a binary that is set up in php.ini according to the included README:
[zend]
zend_extension=/opt/ZendGuardLoader-php-5.3-linux-glibc23-i386/php-5.3.x/ZendGuardLoader.so
zend_loader.enable=1
zend_loader.disable_licensing=0
A quick reload of Apache, and "with Zend Guard Loader v3.3" appears in the phpinfo() page. However, since I used the current version of PHP and the corresponding Zend Guard Loader, I was doomed to fail: Kayako was encoded with Zend Guard for PHP 5.2, and the Zend Guard Loader for PHP 5.3 apparently does not support loading files encoded for earlier versions. This I discovered after searching for the error in my Apache error_log:
[Thu Feb 03 22:31:21 2011] [error] [client 192.168.10.26] PHP Fatal error: Incompatible file format: The encoded file has format major ID 3, whereas the Loader expects 4 in /var/www/kayako/setup/index.php on line 0
I guess this shouldn't be so frustrating, but I do find it rather annoying that there is no backward compatibility. This is somewhat exacerbated by the fact that PHP 5.3 was first released in June 2009, nearly two years ago. Beyond the burden of maintaining legacy versions (granted, 5.2 still seems to be actively maintained), the release lock-in that comes from newer loaders refusing files encoded for older versions means that customers of the Zend Guard encoder need to keep updating (or at least keep releasing up-to-date encoded versions of their supported releases). I think that the associated cost of protecting PHP source with this mechanism is too high.
However, Zend to the rescue! I don't even need to pin the 5.2 series of PHP in my package-managed Ubuntu environment. There is a Zend Server Community Edition, a bundle that includes all of the necessary components to run a Zend Guard encoded application on either the previous (5.2) or current (5.3) version of PHP. This bundle is available for various platforms (including Linux and Windows) and certainly seems like the easiest way to get a Zend Guard application up and running without fighting version incompatibilities. But it isn't necessarily the easiest way to manage a particular web application, unless you subscribe to the single-service-per-machine philosophy.
So, while I find the whole situation very frustrating, specifically the whole no backward compatibility thing, I do think it's nice that the "community edition" bundle is available to get applications up and running.
Useful links from this endeavor: