This is a blog post about a problem which literally drove me crazy for a week. After building our mining rigs I experienced a bad WiFi connection with high pings, periodically occurring every 30 seconds.
Just scroll down to see my – fairly plain – solution.
Getting into the mining business
A few weeks ago some of my co-workers and I decided to build a simple mining rig to mine some Ethereum tokens. The exchange rate for Ethereum has fallen over the last few days, but it is what it is. Anyhow, we bought 12 Nvidia GTX 1070, 12 riser cards, two mainboards, four PSUs with 600 W each and a wattmeter. We assembled everything into an open metal cabinet, put an access point (DD-WRT firmware, Linksys) on it and connected the mainboards to the access point.
I have to say that the mining rig itself is located in one of our flats, in my study room. The access point on top of the cabinet acts as a wireless bridge to our other flat. Both mainboards and my workstation are connected to the access point with Ethernet cables. The other flat contains another access point with a cable modem and internet connectivity. Nothing fancy.
We switched from ethminer to Claymore’s Ethereum Dual miner due to some problems handling multiple cards and wallets. In the end the rigs worked like a charm.
Experiencing lags in Overwatch
Two days later I wanted to play an Overwatch match on my workstation, also located in my study room. The ping was unstable, and a simple ping command showed that I had random timeouts and that the ping spiked every 30 seconds from 20 ms to > 1500 ms for a few seconds. This had not happened before the mining rigs were active.
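A minimal sketch of how such a periodic pattern could be pulled out of a ping log. It assumes one round-trip time per second, with `None` standing for a timeout. The function names, the 500 ms threshold and the trace itself are made up for illustration and are not taken from my actual measurements.

```python
from statistics import median


def spike_starts(rtts_ms, threshold_ms=500.0):
    """Return the sample indices at which a latency spike begins.

    rtts_ms holds one round-trip time per second; None marks a timeout
    and is treated as part of a spike.
    """
    starts = []
    in_spike = False
    for i, rtt in enumerate(rtts_ms):
        is_spike = rtt is None or rtt > threshold_ms
        if is_spike and not in_spike:
            starts.append(i)
        in_spike = is_spike
    return starts


def spike_period(starts):
    """Median gap between consecutive spike starts, or None if fewer than two."""
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return median(gaps) if gaps else None


# Made-up trace: 20 ms baseline with a 3-second spike every 30 seconds.
trace = [20.0] * 90
for start in (10, 40, 70):
    trace[start:start + 3] = [1500.0, None, 900.0]

starts = spike_starts(trace)
period = spike_period(starts)
print(starts, period)
```

A stable median gap close to 30 samples is what separates a periodic problem from random WiFi noise.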
“This must be a software problem of Claymore’s miner”
My first guess was that it had to be a software problem of Claymore’s miner. One of my co-miners had tested a single mainboard with one GPU at his home before, and everything worked flawlessly. I started to analyze the problem:
- Killed each claymore miner process on rig1 and rig2: no lags occurred
- Started a single claymore miner process: lags occurred every 30 seconds with > 600 ms when receiving the first Ethereum share. This indicated a problem in the network implementation of Claymore’s miner or some high bandwidth usage. I checked the bandwidth, but one claymore miner instance just requires 12 kBit/s.
- Started tcpdump on rig1 to identify any conspicuous network activity or packets. Neither UDP nor TCP traffic was eye-catching. I could only correlate the receipt of Ethereum shares with latency spikes. The used network bandwidth was still low.
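The correlation between shares and spikes can be checked with a few lines, for example as sketched below. The timestamps are invented for illustration. In practice they would have to come from the tcpdump capture and the ping log, and the 2-second window is an arbitrary assumption.

```python
def fraction_explained(spike_times, event_times, window_s=2.0):
    """Fraction of spikes with at least one event within window_s seconds."""
    if not spike_times:
        return 0.0
    hits = sum(
        any(abs(spike - event) <= window_s for event in event_times)
        for spike in spike_times
    )
    return hits / len(spike_times)


# Hypothetical timestamps in seconds since the start of the capture.
spikes = [10.0, 40.0, 70.0, 100.0]
shares = [9.5, 39.2, 55.0, 71.0, 99.8]

ratio = fraction_explained(spikes, shares)
print(f"{ratio:.0%} of the spikes have a share event nearby")
```

A high ratio only shows that the two things happen together. It does not say whether the network traffic is the cause.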
“This must be a network problem with Claymore’s miner”
The last application I had slightly similar problems with was Subversion. Ten years ago SVN sometimes failed to commit data. It turned out that Tortoise SVN struggled with special packets, the MTU size of our company network and the MTU size of our ADSL connection. Because of this, I changed the MTU size of the rig running the single claymore process. It did not change anything.
Before I tried something else I disabled the network-related services firewalld and chronyd, without success. Stracing the miner also did not show anything special.
“This must be a problem with Ethereum protocol and DD-WRT”
One interesting observation I made was that the pings for rig -> ap2 (bridge) -> ap1 (router) -> internet and workstation -> ap2 (bridge) -> ap1 (router) -> internet were both bad, but pinging directly from the main access point, ap1 (router) -> internet, showed no problem. What the hell?
I suspected that some TCP settings on ap2 (bridge) led to these hiccups. Fortunately I could check the network settings and stats of both access points (bridge and router) as they are running DD-WRT. As you can imagine: there were no suspicious changes in the network stats (TCP/UDP) when a spike occurred.
Could this be a hardware problem?
As I could not see any problem in the software or on the network layer (>= L2), there could only be a generic hardware problem or some L1 error.
During my TCP stats investigation on the access points, I noticed that the WiFi rate of the bridge (ap2) was unstable and fluctuated heavily. This was very unusual as it had not happened before the rigs were built.
To exclude any directly network-related problems I did the simplest possible thing: I pulled the Ethernet cables of both rigs (running one active miner process each) so they were no longer connected to the access point. To my surprise I still had network lags. WTF?
After killing both miner processes the network lags went away. This obviously had to be a problem with the burst of GPU load the mining process creates.
To give you some insight: Due to some DD-WRT limitations the bridge between both access points uses 2.4 GHz and not 5 GHz. Could there be some interference on the wireless layer?
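For orientation, the 2.4 GHz band is split into channels whose center frequencies are 5 MHz apart, starting at 2412 MHz for channel 1. The sketch below computes them for channels 1 to 13 and checks whether two channels overlap. The 20 MHz channel width is an assumption and the helper names are made up.

```python
def channel_center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz WiFi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("only channels 1 to 13 follow the 5 MHz raster")
    return 2407 + 5 * channel


def channels_overlap(a, b, width_mhz=20):
    """True if two channels of the assumed width share any spectrum."""
    return abs(channel_center_mhz(a) - channel_center_mhz(b)) < width_mhz


for ch in (1, 6, 11):
    print(ch, channel_center_mhz(ch), "MHz")

print(channels_overlap(1, 11))  # channels 1 and 11 are 50 MHz apart
```

Channels 1 and 11 do not overlap, so a narrow interferer on one of them should not disturb the other. A broadband noise source close to the antenna can still cover both.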
After googling for “gpu” and “spike” some links caught my eye:
After reading both posts:
- I switched the WiFi channel from 1 to 11
- I removed the DVI cable from a TFT connected to one rig
- I removed the USB keyboard connected to one rig
Nothing changed. This was probably the point where I wanted to give up. The last thing to test was using another power connection. The ap2 and all four PSUs of the rigs were connected to the same power strip: (psu1, psu2, psu3, psu4) -> wattmeter -> wall socket. Maybe there were some voltage spikes during the bursts of GPU load, confusing the access point hardware?
Switching the wall socket
I had no free wall socket available behind the cabinet containing both rigs. So I put the access point from the top of the rig onto the floor and moved it some centimeters in the direction of the other wall. After the access point had power and was connected to ap1 (router) again, the network spikes dropped from 1600 ms to 800 ms. Uhm? I moved ap2 another 20 centimeters away from the cabinet. Spikes went down to 400 ms.
At a distance of 1.50 meters between rig and access point no more spikes occurred. I double-checked whether the different wall socket was the solution. But switching from the wall socket back to the wattmeter-connected power strip made no difference.
So simple. Just by moving the access point away. This whole thing drove me crazy for at least five afternoons. I felt so stupid.
The high current of the GPU when running the Ethereum mining process produces either a signal at 2.4 GHz (which is less likely) or a harmonic of a signal around 1.2 GHz (which is more likely). I assume that the spikes every 30 seconds occur when both rigs receive the same mining job at almost the same time and start mining. If anybody has more information, just let me know. I am very interested in the technical explanation for this.
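The harmonic idea can at least be sanity-checked with simple arithmetic: the n-th harmonic of a signal sits at n times its fundamental frequency. The sketch below lists which harmonics of a given fundamental fall into the 2.4 GHz ISM band (2400 to 2483.5 MHz). The 1220 MHz value is only an example of "around 1.2 GHz", not something I measured.

```python
BAND_2G4_MHZ = (2400.0, 2483.5)  # 2.4 GHz ISM band limits


def harmonics_in_band(fundamental_mhz, band=BAND_2G4_MHZ, max_order=5):
    """Return the harmonic orders n for which n * fundamental lies in the band."""
    low, high = band
    return [
        n for n in range(1, max_order + 1)
        if low <= n * fundamental_mhz <= high
    ]


# Example fundamental, chosen for illustration only.
print(harmonics_in_band(1220.0))  # second harmonic: 2440 MHz
print(harmonics_in_band(1000.0))  # 1000, 2000, 3000 MHz ... all miss the band
```

This only shows that a 1.2 GHz source could reach the WiFi band through its second harmonic. Whether the rigs actually emit anything there would need a spectrum analyzer to confirm.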