[ZODB-Dev] zeo2a1 performance cold spots

Toby Dickenson tdickenson@geminidataloggers.com
Sat, 29 Jun 2002 20:26:14 +0100


On Friday 28 Jun 2002 7:23 pm, Jeremy Hylton wrote:
> > Ah, no, I was right the first time. FORCE_PRODUCT_RELOAD was causing
> > excess ZEO traffic, but the traffic was still slower than it should be.
> > As far as I can tell this is due to ZEO not turning on TCP_NODELAY,
> > which adds a little latency to every request.
>
> I'm not particularly expert in TCP buffering or the Nagle algorithm, but it
> doesn't seem like it would add latency to every request.

I can almost explain it with your modified smac.py. Each request and response
goes in two (or more) packets. The first packet contains the 4-byte length.
Nagle's algorithm means that the second packet won't be sent immediately: TCP
delays it until the first packet has been acked. I can see this pretty
clearly on Linux using tcpdump.

What I can't explain is why it takes so long to send back that crucial ack.
Even over the loopback interface I see it takes 34ms to ack the 4-byte packet
(but much less to ack the larger subsequent packet).

With my original patch to turn off Nagle the two packets go out instantly,
but still in two separate packets.
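
For reference, turning off Nagle on a TCP socket is a single setsockopt call.
The following is only a minimal sketch, not the patch discussed in this
thread; disable_nagle is a hypothetical helper name, and where ZEO would make
the call is not shown here.

```python
import socket


def disable_nagle(sock):
    """Hypothetical helper: set TCP_NODELAY so small writes are sent at once.

    Returns True if the option reads back as enabled.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Note that this only stops TCP from holding back small writes; it does not
merge the length prefix and the payload into one packet.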

> I know I made one small change at the SMAC layer.  When a message is being
> output, there are now two different strings appended to the list of pending
> messages.  The first string is the length of the second string.

Yes, you are exactly right. Modifying handle_write to aggregate small blocks
(patch below) causes TCP to send most requests and responses in one packet.
This is definitely a good thing.
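
The aggregation idea can be sketched on its own, outside of smac.py. This is
an illustration under assumptions rather than the real ZEO code: frame and
next_chunk are hypothetical names, the big-endian byte order of the length
prefix is assumed, and the 16384 threshold is simply the value used in the
patch.

```python
import struct

MAX_CHUNK = 16384  # threshold taken from the patch; treat it as tunable


def frame(message):
    """Return [length_prefix, message] as two separate pending blocks.

    The prefix is a 4-byte length (big-endian is an assumption here).
    """
    return [struct.pack(">I", len(message)), message]


def next_chunk(output):
    """Merge leading blocks of `output` in place into one larger block.

    Blocks are joined until only one is left or the merged head has
    reached MAX_CHUNK bytes. The merged block is left in output[0] and
    also returned, ready to be passed to a single send() call.
    """
    v = output[0]
    while len(output) > 1 and len(v) < MAX_CHUNK:
        del output[0]
        v += output[0]
        output[0] = v
    return v
```

With this, a pending list such as frame(b"hello") collapses into a single
9-byte block, so the length and the payload reach TCP in one write.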

> However, I'm not particularly thrilled with the idea of disabling the Nagle
> algorithm.  It seems to reduce latency in your benchmark, but it may also
> greatly increase the number of packets sent. I would rather understand
> what about ZEO has changed to cause the effect to be visible.

Nagle's job is to aggregate many small writes into one packet. Now that ZEO is
doing the same thing, there is very little to be gained by not using
TCP_NODELAY. I will commit both patches later unless someone can see any
disadvantage to TCP_NODELAY.

There are definitely some other problems that Nagle's algorithm could cause for
ZEO. Consider if the response to a zeoLoad request is one byte too large for
one packet: we want TCP to immediately send two packets. With Nagle enabled,
TCP would briefly delay sending the last small packet in case there was more
data to follow. In our case, we know that delay is futile.
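
To put numbers on that case: assuming a maximum segment size of 1460 bytes
(a common Ethernet value, not something measured in this thread), a
1461-byte response splits into one full segment plus a 1-byte tail, and the
tail is the part Nagle would hold back. segment_sizes is a hypothetical
helper used only for this arithmetic.

```python
def segment_sizes(payload_len, mss=1460):
    """Split payload_len bytes into segment sizes of at most mss bytes.

    mss=1460 is an assumed default, not a value from the thread.
    """
    full, rest = divmod(payload_len, mss)
    return [mss] * full + ([rest] if rest else [])
```

For example, segment_sizes(1461) gives one 1460-byte segment followed by a
1-byte segment.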



Index: smac.py
===================================================================
RCS file: /cvs-repository/Packages/ZEO/smac.py,v
retrieving revision 1.17
diff -c -2 -r1.17 smac.py
*** smac.py     11 Jun 2002 13:43:06 -0000      1.17
--- smac.py     29 Jun 2002 19:01:13 -0000
***************
*** 139,142 ****
--- 139,145 ----
          while output:
              v = output[0]
+             while len(output)>1 and len(v)<16384:
+                 del output[0]
+                 v += output[0]
              try:
                  n=self.send(v)