Nidhish ABRAHAM's - "Tech Talk": August 2008

Sunday, August 24, 2008

Sockets

A socket is a bidirectional communication device that can be used to communicate with another process on the same machine or with a process running on other machines.

Of the interprocess communication mechanisms covered here, sockets are the only one that permits communication between processes on different computers. Internet programs such as Telnet, rlogin, FTP, talk, and the World Wide Web use sockets.

For example, you can obtain the WWW page from a Web server using the Telnet program because they both use sockets for network communications.

To open a connection to a WWW server at www.nidhish.co.nr, use
telnet www.nidhish.co.nr 80. The magic constant 80 is the port number; it specifies
a connection to the Web server program running on www.nidhish.co.nr instead of some other process.

Try typing GET / and pressing Enter after the connection is established. This sends a message through the socket to the Web server, which replies by sending the home page's HTML source and then closing the connection.
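What Telnet does here can be sketched in C. This is only a minimal sketch, assuming a POSIX system: the helper names connect_to and send_get_request are made up for illustration, and the request is the same bare "GET /" line typed above, not a complete HTTP request.

```c
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std modes */
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a TCP connection to host:port (for example "80").
   Returns a connected socket descriptor, or -1 on failure.
   Hypothetical helper, not part of any library. */
int connect_to(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    /* Try each returned address until one accepts the connection. */
    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd == -1)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Send the line typed into Telnet: "GET /" followed by CR LF.
   Returns the number of bytes written, or -1 on error. */
ssize_t send_get_request(int fd)
{
    static const char request[] = "GET /\r\n";
    return write(fd, request, sizeof request - 1);
}
```

A caller would use connect_to("www.nidhish.co.nr", "80"), pass the descriptor to send_get_request(), and then read() from it until the server closes the connection.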

Networks

Most network applications can be divided into two pieces: a client and a server. The client is the side that initiates the communication, whereas the server responds to incoming client requests.

There are numerous protocols and mechanisms, such as NetBIOS, RPC (Remote Procedure Call), DCOM, pipes, and IPC (Inter-Process Communication), that can be used for the communication link. We will only look at TCP/IP, and in particular at IPv4, since it is widely implemented by socket vendors.

TCP (Transmission Control Protocol)

Although TCP can in principle run over any network-layer protocol, it is almost always paired with IP. TCP is a connection-oriented stream protocol (like a telephone call). A connection is set up with a handshake, and each piece of data that is sent must be acknowledged by the recipient before TCP's retransmission timer expires. TCP provides services such as data reliability, error checking, and flow control. If a data packet is corrupted or lost (not acknowledged), TCP automatically retransmits it from the sending side.

Because packets can take many different routes, one packet may arrive before another that was sent earlier. As data packets arrive, it is the job of TCP to assemble them into the proper order. This is shown below with a fictitious network topology layout, where the data packet takes n hops to get from the source to the destination. On a bigger network like the Internet, there are many routes a data packet can take to arrive at its final destination.

Network Hop Topology

Byte Ordering

There are two types of memory byte ordering in use today, and which one applies depends on the machine. They are known as little-endian and big-endian. Because of this, we have to be very careful how we interpret numerical data: if we do not take endianness into account, the numerical data we read will be corrupt.

When working with numeric data, one needs to convert from machine (host) byte order to network byte order when sending data (write-op), and then from network byte order to machine byte order when retrieving data (read-op). The APIs to make the conversion are:

htons() and htonl()
//host to network

uint16_t htons(uint16_t host16bitvalue);
uint32_t htonl(uint32_t host32bitvalue);

ntohs() and ntohl()
//network to host

uint16_t ntohs(uint16_t net16bitvalue);
uint32_t ntohl(uint32_t net32bitvalue);
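As a small sketch of how these conversions are typically used when writing numbers into a buffer that will be sent over a socket: the helper names put_u16, get_u16, put_u32 and get_u32 are invented for this example. Whatever the host's byte order, the bytes in the buffer end up most significant first.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit host value in buf in network (big-endian) order. */
void put_u16(unsigned char *buf, uint16_t host_value)
{
    uint16_t net = htons(host_value);
    memcpy(buf, &net, sizeof net);
}

/* Read a 16-bit network-order value from buf and return it in host order. */
uint16_t get_u16(const unsigned char *buf)
{
    uint16_t net;
    memcpy(&net, buf, sizeof net);
    return ntohs(net);
}

/* Store a 32-bit host value in buf in network (big-endian) order. */
void put_u32(unsigned char *buf, uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(buf, &net, sizeof net);
}

/* Read a 32-bit network-order value from buf and return it in host order. */
uint32_t get_u32(const unsigned char *buf)
{
    uint32_t net;
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}
```

The sender calls put_u32() before the write operation and the receiver calls get_u32() after the read operation, so both sides agree on the value even if one machine is little-endian and the other big-endian.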

Saturday, August 16, 2008

Thread Creation

Each thread in a process is identified by a thread ID. When referring to thread IDs in C or C++ programs, use the type pthread_t. Each thread executes a thread function.

This is just an ordinary function and contains the code that the thread should run. When the function returns, the thread exits. The pthread_create function creates a new thread.

You provide it with the following:
1. A pointer to a pthread_t variable, in which the thread ID of the new thread is stored.

2. A pointer to a thread attribute object. This object controls details of how the thread interacts with the rest of the program. If you pass NULL as the thread attribute, a thread will be created with the default thread attributes.

3. A pointer to the thread function. This is an ordinary function pointer, of this type: void* (*) (void*)

4. A thread argument value of type void*. Whatever you pass is simply passed as the argument to the thread function when the thread begins executing.

#include <pthread.h>
#include <stdio.h>

/* Prints x's to stderr. The parameter is unused. Does not return. */

void* print_xs(void* unused)
{
    while (1)
    {
        fputc('x', stderr);
    }
    return NULL;
}

/* The main program. */
int main()
{
    pthread_t thread_id;

    /* Create a new thread. The new thread will run
       the print_xs function. */

    pthread_create(&thread_id, NULL, &print_xs, NULL);

    /* Print o's continuously to stderr. */
    while (1)
    {
        fputc('o', stderr);
    }
    return 0;
}

Compile and link this program using the following command:

> cc -o thread-create thread-create.c -lpthread
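The program above passes NULL as the thread argument. As a rough sketch of the fourth parameter in use, the variant below hands the thread a pointer to a structure and waits for the result with pthread_join, a call this post does not cover. The names sum_job, sum_worker and sum_in_thread are made up for illustration.

```c
#include <pthread.h>

/* Input and output of the thread, passed through the void* argument. */
struct sum_job {
    int upto;     /* input: add up 1..upto */
    long result;  /* output: filled in by the thread */
};

/* Thread function: has the required type void* (*)(void*). */
static void *sum_worker(void *arg)
{
    struct sum_job *job = arg;
    int i;

    job->result = 0;
    for (i = 1; i <= job->upto; i++)
        job->result += i;
    return NULL;
}

/* Run sum_worker in a new thread and return its result, or -1 on error. */
long sum_in_thread(int upto)
{
    struct sum_job job = { upto, 0 };
    pthread_t thread_id;

    /* The last argument is handed to sum_worker unchanged. */
    if (pthread_create(&thread_id, NULL, sum_worker, &job) != 0)
        return -1;

    /* Wait for the thread to finish before reading job.result. */
    if (pthread_join(thread_id, NULL) != 0)
        return -1;

    return job.result;
}
```

Joining matters here because job lives on the caller's stack. Without the join, sum_in_thread could return while the thread is still writing to it.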

Sunday, August 10, 2008

Inter-Process Communication (IPC) Methods

1. File - (All operating systems)

A computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished. Computer files can be considered as the modern counterpart of paper documents which traditionally were kept in offices' and libraries' files, which are the source of the term.

2. Signal - (Most operating systems; some systems, such as Windows, only implement signals in the C run-time library and do not actually provide support for their use as an IPC technique)

A signal is a limited form of inter-process communication used in Unix, Unix-like, and other POSIX-compliant operating systems. Essentially it is an asynchronous notification sent to a process in order to notify it of an event that occurred. When a signal is sent to a process, the operating system interrupts the process' normal flow of execution. Execution can be interrupted during any non-atomic instruction. If the process has previously registered a signal handler, that routine is executed. Otherwise the default signal handler is executed.
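Registering a signal handler might look like the sketch below, which assumes a POSIX system and uses sigaction. The choice of SIGUSR1 and the names on_sigusr1, install_handler and signal_was_received are only illustrative.

```c
#define _POSIX_C_SOURCE 200809L   /* expose sigaction() under strict -std modes */
#include <signal.h>
#include <string.h>

/* Set by the handler. sig_atomic_t can be written safely from a handler. */
static volatile sig_atomic_t got_signal = 0;

/* The signal handler: runs when SIGUSR1 is delivered to this process. */
static void on_sigusr1(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Register on_sigusr1 for SIGUSR1. Returns 0 on success, -1 on error. */
int install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);   /* block no extra signals while handling */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Returns 1 once the handler has run, 0 before that. */
int signal_was_received(void)
{
    return got_signal;
}
```

After install_handler() succeeds, another process can notify this one with kill(pid, SIGUSR1), or the process can signal itself with raise(SIGUSR1). In both cases the normal flow of execution is interrupted and on_sigusr1 runs.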

3. Socket - (Most operating systems)

Sockets are just like "worm holes" in science fiction. When things go into one end, they (should) come out of the other. Different kinds of sockets have different properties. Sockets are either connection-oriented or connectionless. Connection-oriented sockets allow for data to flow back and forth as needed, while connectionless sockets (also known as datagram sockets) allow only one message at a time to be transmitted, without an open connection. There are also different socket families. The two most common are AF_INET for internet connections, and AF_UNIX for unix IPC (interprocess communication).

Of the various forms of IPC (Inter Process Communication), sockets are by far the most popular. On any given platform, there are likely to be other forms of IPC that are faster, but for cross-platform communication, sockets are about the only game in town.
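To make the "worm hole" picture concrete, here is a minimal sketch of a connection-oriented AF_UNIX socket between a parent and a child process, assuming a POSIX system. The function name shout_via_child is invented, and the sketch only handles short messages (under 128 bytes).

```c
#include <ctype.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send msg to a child process over a socket pair. The child answers on the
   same socket with the text in upper case. Returns the reply length, or -1. */
ssize_t shout_via_child(const char *msg, char *reply, size_t replysize)
{
    int sv[2];   /* two connected sockets: sv[0] for parent, sv[1] for child */
    pid_t pid;
    ssize_t n = -1;
    size_t len = strlen(msg);

    if (replysize == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {                       /* child */
        char buf[128];
        ssize_t got, i;

        close(sv[0]);
        got = read(sv[1], buf, sizeof buf);
        for (i = 0; i < got; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        if (got > 0 && write(sv[1], buf, (size_t)got) == -1)
            _exit(1);
        close(sv[1]);
        _exit(0);
    }

    /* parent: data goes in one end and the reply comes back the same way */
    close(sv[1]);
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[0], reply, replysize - 1);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    if (n >= 0)
        reply[n] = '\0';
    return n;
}
```

Unlike a pipe, each end of the pair can be both read and written, which is why one descriptor per process is enough.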

4. Pipe - (All POSIX systems)

In Unix-like computer operating systems, a pipeline is the original software pipeline: a set of processes chained by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) of the next one. Each connection is implemented by an anonymous pipe. Filter programs are often used in this configuration.
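The anonymous pipe behind each such connection can be created directly with pipe(). The following is a minimal sketch, with pipe_from_child as a made-up name: a child process writes into the pipe and the parent reads from it. It assumes a short message, so a single read() is enough.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes msg into a pipe. The parent reads it into buf.
   Returns the number of bytes read, or -1 on error. */
ssize_t pipe_from_child(const char *msg, char *buf, size_t bufsize)
{
    int fds[2];   /* fds[0] is the read end, fds[1] is the write end */
    pid_t pid;
    ssize_t n;

    if (bufsize == 0 || pipe(fds) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: the writer */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) == -1)
            _exit(1);
        close(fds[1]);
        _exit(0);
    }

    /* parent: the reader */
    close(fds[1]);                  /* so read() sees end-of-file eventually */
    n = read(fds[0], buf, bufsize - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

A shell builds a pipeline in much the same way, except that it uses dup2() to attach the pipe ends to stdout and stdin before running each program.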

5. Named pipe - (All POSIX systems)

In computing, a named pipe (also FIFO for its behavior) is an extension to the traditional pipe concept on Unix and Unix-like systems, and is one of the methods of inter-process communication. The concept is also found in Microsoft Windows, although the semantics differ substantially. A traditional pipe is "unnamed" because it exists anonymously and persists only for as long as the process is running. A named pipe is system-persistent and exists beyond the life of the process and must be "unlinked" or deleted once it is no longer being used. Processes generally attach to the named pipe (usually appearing as a file) to perform IPC (inter-process communication).

6. Semaphore - (All POSIX systems)

In UNIX, a semaphore is a variable that acts as a counter. Why do we need such a variable?

For instance, two processes may try to access the same file simultaneously. In that case, access by one process must be controlled while the other process is using the file. This is done by assigning a value to a semaphore.

A semaphore, in computer science, is a protected variable (an entity storing a value) or abstract data type (an entity grouping several variables that may or may not be numerical) which constitutes the classic method for restricting access to shared resources, such as shared memory, in a multiprogramming environment (a system where several programs may be executing, or taking turns to execute, at once).

Semaphores exist in many variants, though usually the term refers to a counting semaphore, since a binary semaphore is better known as a mutex. A counting semaphore is a counter for a set of available resources, rather than a locked/unlocked flag of a single resource.

Semaphores are the classic solution to preventing race conditions in the dining philosophers problem, although they do not prevent resource deadlocks.
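The counting behaviour can be sketched with POSIX unnamed semaphores (sem_init, sem_trywait, sem_post), assuming a system such as Linux that supports them. The names try_acquire, release_one and count_granted are invented for this example, and everything runs in a single thread so that the result is easy to follow.

```c
#include <semaphore.h>

/* Try to take one resource. Returns 1 on success, 0 if none are free. */
int try_acquire(sem_t *sem)
{
    return sem_trywait(sem) == 0;   /* decrements the counter if it is > 0 */
}

/* Give one resource back by incrementing the counter. */
void release_one(sem_t *sem)
{
    sem_post(sem);
}

/* Make `wanted` requests against a pool of `pool_size` resources and
   return how many were granted, or -1 if the semaphore cannot be created. */
int count_granted(unsigned pool_size, int wanted)
{
    sem_t pool;
    int granted = 0;
    int i;

    /* Second argument 0: shared between threads of this process only. */
    if (sem_init(&pool, 0, pool_size) == -1)
        return -1;

    for (i = 0; i < wanted; i++)
        if (try_acquire(&pool))
            granted++;

    /* Return everything that was taken, then destroy the semaphore. */
    for (i = 0; i < granted; i++)
        release_one(&pool);
    sem_destroy(&pool);
    return granted;
}
```

With a pool of 2 and 5 requests, only 2 are granted; the other requests find the counter at zero. A real program would usually call the blocking sem_wait() instead, so that a requester sleeps until another one calls sem_post().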

7. Shared memory - (All POSIX systems)

What is Shared Memory?

Shared memory is an efficient means of passing data between programs. One program creates a memory region which other processes (if permitted) can access, so that two or more processes communicate through a single chunk of memory. The shared memory system can also be used to set permissions on memory, allowing for things like malloc debuggers to be written.

Types of Shared memory available

Basically there are two different types of shared memory available for most flavors of UNIX. As you may have guessed, each of the two original ancestors of modern UNIX has its own implementation, although almost all modern UNIX flavors implement both. The names of the respective implementations are System V IPC and BSD mmap.
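As a small sketch of the mmap flavor: a region mapped with MAP_SHARED before fork() is visible to both parent and child. MAP_ANONYMOUS is widely available on Linux and the BSDs but is not guaranteed everywhere, so treat it as an assumption. The function name share_int_with_child is made up for the example.

```c
#define _DEFAULT_SOURCE   /* expose MAP_ANONYMOUS on glibc under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Let a child process store `value` in a shared region, then return what the
   parent reads from that region. Returns -1 on error. */
int share_int_with_child(int value)
{
    int *shared;
    int seen;
    pid_t pid;

    /* One int of memory, shared with any process forked after this call. */
    shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid = fork();
    if (pid == -1) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {            /* child: write into the shared region */
        *shared = value;
        _exit(0);
    }

    /* parent: wait for the child, then read what it wrote */
    waitpid(pid, NULL, 0);
    seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

Waiting for the child is the only synchronization used here. Processes that access the region at the same time would need something like a semaphore to coordinate.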

8. Message queue - (Most operating systems)

In computer science, a message queue is a software-engineering component used for interprocess communication or inter-thread communication within the same process. It uses a queue for messaging – the passing of control or of content. Group communication systems provide similar kinds of functionality.

Message queues provide an asynchronous communications protocol, meaning that the sender and receiver of the message do not need to interact with the message queue at the same time. Messages placed onto the queue are stored until the recipient retrieves them.

Examples of commercial implementations of this kind of message queueing software (also known as Message Oriented Middleware) include IBM's WebSphere MQ (formerly MQ Series), Oracle Advanced Queuing (AQ) within an Oracle database, and Microsoft's MSMQ. There is a Java standard called Java Message Service, which has, associated with it, a number of implementations, both proprietary and free software.

Synchronous vs. asynchronous

Many of the more widely-known communications protocols in use operate synchronously. The HTTP protocol – used in the World Wide Web and in web services – offers an obvious example.

In a synchronous model, one system makes a connection to another, sends a request and waits for a reply.

Inter-Process Communication (IPC) in Unix

Inter-Process Communication (IPC) is a set of techniques for the exchange of data among two or more threads in one or more processes. Processes may be running on one or more computers connected by a network. IPC techniques are divided into methods for message passing, synchronization, shared memory, and remote procedure calls (RPC). The method of IPC used may vary based on the bandwidth and latency of communication between the threads, and the type of data being communicated.

IPC may also be referred to as inter-thread communication and inter-application communication.

Friday, August 8, 2008

VIDEO - Beginning C++

Wednesday, August 6, 2008

What is a UNIX Process

A process is an instance of a computer program that is being sequentially executed by a computer system that has the ability to run several computer programs concurrently.

Unix Process

An entity that executes a given piece of code, has its own execution stack, its own set of memory pages, its own file descriptors table, and a unique process ID.

The fork() system call

The fork() system call is the basic way to create a new process. It is also an unusual system call, since it returns twice(!): once in the parent process and once in the newly created child.

Example:

#include <stdio.h>     /* printf(), perror() */
#include <stdlib.h>    /* exit() */
#include <sys/types.h> /* pid_t */
#include <sys/wait.h>  /* wait() */
#include <unistd.h>    /* fork() */

int main(void)
{
    pid_t child_pid;
    int child_status;

    /* let's fork off a child process... */
    child_pid = fork();

    /* check what the fork() call actually did */
    switch (child_pid) {
    case -1: /* fork() failed */
        perror("fork"); /* print a system-defined error message */
        exit(1);
    case 0: /* fork() succeeded, we're inside the child process */
        printf("Hello, World\n");
        exit(0); /* here the CHILD process exits, not the parent. */
    default: /* fork() succeeded, we're inside the parent process */
        wait(&child_status); /* wait till the child process exits */
    }
    /* parent's process code may continue here... */
    return 0;
}

Notes:

* The perror() function prints an error message based on the value of the errno variable, to stderr.

* The wait() system call waits until any child process exits, and stores its exit status in the variable supplied. A set of macros, such as WIFEXITED() and WEXITSTATUS(), is available to check this status.
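As a brief sketch of checking the status, the function below (child_exit_code is a made-up name) forks a child that exits with a given code and decodes the status with the WIFEXITED and WEXITSTATUS macros. It uses waitpid() rather than wait() so that it waits for that particular child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code` and return the exit code as seen
   by the parent. Returns -1 on error or if the child did not exit normally. */
int child_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: exit immediately */

    if (waitpid(pid, &status, 0) == -1)
        return -1;

    if (WIFEXITED(status))           /* true if the child called exit/_exit */
        return WEXITSTATUS(status);  /* the low 8 bits of the exit code */
    return -1;                       /* e.g. the child was killed by a signal */
}
```

The raw status value packs several pieces of information together, which is why it should be examined through the macros instead of being compared directly.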

Thread - a thread of execution

What are threads?

A thread is a sequence of instructions to be executed within
a program. Normal UNIX processes consist of a single thread
of execution that starts in main(). In other words, each line
of your code is executed in turn, exactly one line at a time.
Before threads, the normal way to achieve multiple
instruction sequences (i.e., doing several things at once, in
parallel) was to use the fork() and exec() system calls to
create several processes -- each being a single thread of
execution.

In the UNIX operating system, every process has the following:

1. a virtual address space (i.e., stack, data, and code segments)

2. system resources (e.g., open files)

3. a thread of execution

These resources are private to each process; for example, processes
cannot peek into one another's address space, open and close
files for one another, etc.

Monday, August 4, 2008

C++ program - Hello World!

Program 2. C++ source file—main.cpp

#include <iostream>
using namespace std;

int main()
{
    cout << "Hello World!" << endl;
    return 0;
}

The main function serves a special purpose in C++ programs:

* The run-time environment calls the main function to begin
program execution.

* The opening curly brace indicates the beginning of the
definition of the main function.

* To print a value to the screen, write the word cout, followed by the insertion operator (<<), which you create by typing the less-than character (<) twice. Even though this is two characters, C++ treats it as one.

* A commonly used alternative to the newline character \n is endl, which also flushes the output stream.

* The return statement terminates the execution of the main
function and causes it to return the integer value 0, which is
interpreted by the run-time system as an exit code indicating
successful execution.

* The closing curly brace indicates the end of the code for
the main function.

Compiling a single C++ source file:

The C++ compiler is called g++. The -o option names the executable file it produces.

> g++ main.cpp -o main

Introduction to UNIX programming

The basic steps required to create a C or C++ Linux program:

* How to create C and C++ programs in a Linux environment
* Compiling code
* Debugging the result

Program 1. C source file—main.c

#include <stdio.h>

int main(void)
{
printf("\nHello, World\n\n");
return 0;
}

The main function serves a special purpose in C programs:

* The run-time environment calls the main function to begin
program execution.

* The opening curly brace indicates the beginning of the
definition of the main function.

* The program calls a function named printf, which is declared in
stdio.h and supplied by a system library.

* The return statement terminates the execution of the main
function and causes it to return the integer value 0, which is
interpreted by the run-time system as an exit code indicating
successful execution.

* The closing curly brace indicates the end of the code for
the main function.

Compiling a single C source file:

The C compiler is called gcc.

> gcc main.c -o main
