Computer Architecture: How do applications communicate with the running instances of the systems they run inside of?

Posted 2019-01-29 14:13

Question:

Prelude: This is admittedly a fairly broad question regarding computer architecture, but one that I hear from others and wonder about quite often myself. I also don't think that there is a direct or quick answer to this. However, I was hoping someone well-versed in systems architecture could provide some insight.

Some background: I am primarily a full-stack developer focusing mostly on web technologies and databases. I do have some background in C and tinkering with a good deal of low-level stuff, but that was a very long time ago and was non-academic. As such, I never got very deep into OS architecture, and this is one piece that eludes me. I am aware of various techniques and methods of accomplishing these tasks (especially on a higher level with technologies geared for this purpose), but am lacking a holistic picture/understanding of the low-level logistics of how this happens - particularly on an OS level.

The general question is: how do applications running inside of a "container" actually talk to the running instance of that container? By "container", I mean an instance of running code which is already loaded into memory (examples of such code could be an operating system, a graphics drawing interface, an application server, a driver, etc).

Also, this question applies only to compiled code, and to communication between systems running on the same machine.

For example

Let's say I build a simple library whose purpose is to draw a pixel on a screen. Let's also say this library has one method, drawPixel(int x, int y).

The library itself manages its own drawing context (which could be anything from a raw SVGA buffer to a desktop window). Applications using this API simply link dynamically against the library, and call the drawPixel method, without any awareness of the library's exact actions after the call.

Under the hood, this drawPixel method is supposed to draw to a window on the desktop, creating it if it doesn't exist on the first call.

However, if the setup really were that straightforward and simple, each calling application would "pull in and run" all of the code in drawPixel and its dependencies, effectively giving each running application its own instance of the entire call chain (and thus, if it were called by 5 different applications, you'd end up with 5 different windows instead of a shared context to one window). (I hope I'm explaining this right)

So, my question is, how does this "sharing" happen in modern operating systems?

Would the code for drawPixel actually be replaced with IPC code? Or would it be regular graphics code, but somehow "loaded" into the OS in a way that there is one universally accessible running instance of it, which other applications call at-will?

Some cases I'm aware of

I know that there are many approaches to this issue, and am aware of a few of them. However, all of these seem to address specific niches and have shortcomings; none appear to be comprehensive enough to explain the incredible capabilities (regarding interconnectedness of OS & app services) of modern application ecosystems.

For example:

  • In the old (DOS) days, I believe app <-> OS communication was accomplished via system interrupts.
  • In the UNIX world, this is done via stdin/stdout pipes on the console, and via a network protocol (over sockets) in the X Window System.
  • There were IPC platforms like COM+/DCOM/DCOP/DBus on Windows & Linux, but again, these appear to be geared at a specific purpose (building & managing components at scale; predecessors of present-day SOA).

The question

What are some of the other ways that this kind of communication can be facilitated? Or, more specifically, how "is this done" in a traditional sense, especially when it comes to OS APIs?

Some examples of more specific questions:

  • How does a kernel "load" a device driver on boot, which runs its own code (in an isolated space?) but still talks to the kernel above it, which is currently running in memory? How does this communication happen?

  • How do applications talk to windowing subsystems (aside from X and Quartz, which use sockets)? I think Win32 used interrupts (maybe it still does?), but how does the newer stuff work? I'd be very surprised to find out that even today, sophisticated frameworks like WPF or Metro still boil down to calling interrupts. I'm actually not even sure that the Win32 APIs are used by these systems.

  • What about lower-level graphics subsystems like GDI+ and the Linux Framebuffer?

Note: I think in the case of WIN32 (and possibly GDI+), for example, you get a pointer (handle) to a context, so the concept is effectively "shared memory". But is it as simple as that? It would appear pretty unsafe to just get a raw pointer to a raw resource. Meaning, there are things that protect you from writing arbitrary data to this pointer, so I think it is more complex than that.

  • (this might be a bit out of context as it's JVM-specific) How do servlets running inside an application server talk to the actual application server? Meaning, how do they load themselves "inside the context" of the currently running server?

  • Same question for IIS - how exactly is the plumbing set up so that IIS can control and communicate back and forth with a separate process running an ASP.NET application?

Note: I am not sure if this question makes much sense and may admittedly be dumb or poorly-worded. However, I was hoping that my point came across and that someone with a systems background could chime in on the standard "way of doing things" when it comes to these scenarios (if there is such a thing).

Edit: I am not asking for an exhaustive list of IPC methods. There is a specific concept that I am trying to find out about, but I am not familiar with the correct terminology and so am having trouble finding the words to pinpoint it. This is why this question comes with so many examples, to "eliminate" the parts that the question does not target.

Answer 1:

This is too broad a question, but here are some points (related to Linux; the principles should be similar on Windows, though as closed-source software you may not be able to learn all of its details):

The elementary system calls (those listed in syscalls(2)...) are invoked by a special machine instruction (e.g. SYSENTER or SYSCALL) which switches the processor into kernel mode (with the system call number and arguments passed in defined registers, following the ABI convention). Hence user-space code can be viewed as running on a kind of virtual machine (defined by the user-mode instruction set plus the system call primitives). BTW, the Linux kernel can load kernel modules, e.g. to add additional code (such as device drivers) into itself, and that is also done through system calls.

The inter-process communication facilities are built on top of these system calls (and are often used by the standard library inside higher-level functions; e.g. getaddrinfo(3) might interact indirectly with some DNS service, see nsswitch.conf(5)). Read Advanced Linux Programming for more details. In practice you'll need several server programs (an idea pushed to its extreme in microkernel approaches), notably (on recent Linux) systemd. Drivers and kernel modules are loaded by specific system calls and afterwards are part of the kernel, so they are usable through other system calls. Play with strace(1) to see the actual system calls made by some Linux program. Some information is provided by the kernel through pseudo file systems (see proc(5)...) accessible via system calls.

Every communication between a user program and the kernel goes through system calls, and inter-process communication is in turn implemented on top of them. Sometimes the kernel makes an upcall into user code (on Linux, via signals).

The Linux framebuffer (and the physical keyboard and mouse) is generally accessed by only a single server, which other desktop applications communicate with using the usual IPC facilities (sockets); that server is the X11 or Wayland server.

Also read a good book on operating systems, e.g. the freely downloadable Operating Systems: Three Easy Pieces.

For Windows, macOS, and Android, it is very similar. However, since Windows (etc.) is proprietary software, you might not be able to learn all the details (and you might not be allowed to reverse-engineer them). In contrast, Linux is free software, so you can study its source code.

My advice would be to understand in detail how Linux works (this would take several years) and to study some relevant source code (which is possible for free software). If you need an equally deep understanding of Windows, you might need to buy a source code license for it (probably millions of dollars) and sign an NDA. I don't know Windows at all, but AFAIK it is defined mainly by a huge C API. Rumor has it that the Windows kernel is microkernel-like, but Microsoft has an economic interest in hiding its implementation details.

See also osdev.