
Weird PPP trouble



Hi, everyone.

Maybe someone can shed light on the strange behavior I am
experiencing with PPP. The symptom is that sometimes
after pppd is killed, nothing can subsequently open the
serial port it used -- unless the open call is
non-blocking, in which case everything works fine.

Specifically, this means that pppd (or more likely chat)
cannot open the serial port again after it dies, which is
fantastically aggravating to me, as I have cause to bring
the link up and down fairly frequently.

I have checked all the obvious things: permissions on
/dev/ttyS1 (a 14K AT&T internal) are fine; there are no
lock files lying around; nothing else has the port open.
I wrote some code that tried to open that port, and
that's how I discovered that

 open("/dev/ttyS1",O_RDWR | O_NDELAY)

works fine (and I really can read & write on the port
with the returned fd), but

 open("/dev/ttyS1", O_RDWR)

blocks forever. I also discovered that if I just start
Minicom up and let it open the port, which it does
successfully, and close it again, then everything's
cool, and pppd and chat work fine again. I stumbled
about in the Minicom code that configures the serial
port, and tried to write something that did the same
things, but it didn't work; apparently I'm missing
something...
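
For reference, here is a minimal sketch in C of the kind of test
described above. It assumes a POSIX system; the helper names
open_without_waiting and port_ignores_carrier are made up for
illustration and are not taken from Minicom or pppd. The first helper
opens the device non-blocking and then switches the descriptor back to
blocking I/O. The second reports whether CLOCAL is set, since a
blocking open on a tty normally waits for carrier detect unless that
flag is set. Whether that is what is going on here is only a guess.

```c
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Open a device without waiting in open(), then restore blocking I/O.
 * O_NONBLOCK is the POSIX name; on Linux O_NDELAY has the same value.
 * Returns the file descriptor, or -1 on failure. */
int open_without_waiting(const char *path)
{
    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Clear O_NONBLOCK so later read() and write() calls block normally. */
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Report whether the tty ignores modem control lines (CLOCAL set).
 * Returns 1 if set, 0 if clear, -1 if fd is not a terminal. */
int port_ignores_carrier(int fd)
{
    struct termios tio;
    if (tcgetattr(fd, &tio) < 0)
        return -1;
    return (tio.c_cflag & CLOCAL) ? 1 : 0;
}
```

Calling open_without_waiting("/dev/ttyS1") and then
port_ignores_carrier() on the result, once after pppd dies and once
after Minicom has opened and closed the port, would show whether the
two states differ in that flag.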

But then I discovered another bizarre thing: the above
symptoms only occur when I dial up my company's terminal
server (a BitRunner, I think). When I dial up MindSpring,
everything is fine after the link comes down. So I poked
around in the logfile written by PPP, and found some stuff
in the BitRunner case that isn't present in the MindSpring
case, but I'm not sure what it means, or whether it has
any bearing on the problem. Here's an excerpt:

--- from /var/adm/messages/local2.debug ---

>Jan 17 22:07:40 whyknot pppd[75]: Setting itimer for 0
> seconds in untimeout.
>Jan 17 22:07:40 whyknot pppd[75]: ipcp: up
>Jan 17 22:07:40 whyknot pppd[75]: local IP address
> 134.0.201.51
>Jan 17 22:07:40 whyknot pppd[75]: remote IP address
> 134.0.201.50
>Jan 17 22:07:40 whyknot pppd[75]: Script /etc/ppp/ip-up
> started; pid = 78
>22:07:40 whyknot pppd[75]: rcvd [proto=0x21] 45 00 00 58
> 4c b8 00 00 1d 11 7c 37 86 00 c8 a5 86 00 ff ff 11 59 07
> db 00 44 70 72 00 00 00 02 45 44 44 53 5f 43 4c 5f 6d 6b
> 65 5f 30 33 00 44 40 00 be e0 00 00 4a 2f 10 00 00 80 10
> 00 00 00 40 01 01 40 00 00 00 04 00 00 00 18 7b 03 38
>Jan 17 22:07:40 whyknot pppd[75]: 80 30 fd 03 b2 00 07
> d5 73 9f 33
>Jan 17 22:07:40 whyknot pppd[75]: input: Unknown protocol
> (21) received!
>Jan 17 22:07:40 whyknot pppd[75]: sent [LCP ProtRej id=0xc
> 00 21 45 00 00 58 4c b8 00 00 1d 11 7c 37 86 00 c8 a5 86
> 00 ff ff 11 59 07 db 00 44 70 72 00 00 00 02 45 44 44 53
> 5f 43 4c 5f 6d 6b 65 5f 30 33 00 44 40 00 be e0 00 00 4a
> 2f 10 00 00 80<

--- end of included text ---
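
PPP protocol number 0x21 is IP, so the first 20 bytes of the rcvd
frame above can be read as an IPv4 header. Here is a rough sketch in C
that pulls out the main fields, assuming the dump really is a raw IPv4
packet with no options. The struct and function names are
hypothetical, and the byte array is copied by hand from the log.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical summary of the fixed part of an IPv4 header. */
struct ip_summary {
    unsigned version;     /* high nibble of byte 0 */
    unsigned header_len;  /* low nibble of byte 0, in bytes */
    unsigned total_len;   /* bytes 2-3, big-endian */
    unsigned ttl;         /* byte 8 */
    unsigned protocol;    /* byte 9 (17 is UDP) */
    char src[16];         /* dotted-quad source address */
    char dst[16];         /* dotted-quad destination address */
};

/* Fill *out from the first 20 bytes of p.
 * Returns 0 on success, -1 if the buffer is too short. */
int summarize_ipv4(const uint8_t *p, size_t len, struct ip_summary *out)
{
    if (len < 20)
        return -1;
    out->version    = p[0] >> 4;
    out->header_len = (p[0] & 0x0fu) * 4u;
    out->total_len  = ((unsigned)p[2] << 8) | p[3];
    out->ttl        = p[8];
    out->protocol   = p[9];
    snprintf(out->src, sizeof out->src, "%u.%u.%u.%u",
             (unsigned)p[12], (unsigned)p[13],
             (unsigned)p[14], (unsigned)p[15]);
    snprintf(out->dst, sizeof out->dst, "%u.%u.%u.%u",
             (unsigned)p[16], (unsigned)p[17],
             (unsigned)p[18], (unsigned)p[19]);
    return 0;
}

/* First 20 bytes after [proto=0x21] in the rcvd line of the log. */
static const uint8_t logged_frame[20] = {
    0x45, 0x00, 0x00, 0x58, 0x4c, 0xb8, 0x00, 0x00, 0x1d, 0x11,
    0x7c, 0x37, 0x86, 0x00, 0xc8, 0xa5, 0x86, 0x00, 0xff, 0xff
};
```

Read this way, the frame is a UDP packet (protocol 17) from
134.0.200.165 to 134.0.255.255 with a total length of 88 bytes.
Whether it has anything to do with the serial port problem is a
separate question.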

Aside from that, I can't find anything different between
the two cases. Specifically, the death of pppd is
the same, at least as far as I can tell from the logfile.
I bring the link down by just doing "kill <pid>" on the
pppd process; I think that is the right way to do it,
but let me know if it's not.

Other important info: my kernel is version 1.2.4,
unpatched.  My pppd is version 2.1.2.

Anyway, if anyone could give me the faintest clue about
this, I would really appreciate it.

Thanks,

-- Joe

* Corollary to Clarke's Law: A sufficiently primitive *
* magic is indistinguishable from technology.         *