Shell
Some links and notes:
Text manipulations
In this exercise, we will manipulate the famous intimate mail exchange between George Sand and Alfred de Musset.
- We first want to keep only the odd lines of the following text:
Je suis très émue de vous dire que j'ai
bien compris l'autre soir que vous aviez
toujours une envie folle de me faire
danser. Je garde le souvenir de votre
baiser et je voudrais bien que ce soit
là une preuve que je puisse être aimée
par vous. Je suis prête à vous montrer mon
affection toute désintéressée et sans cal-
cul, et si vous voulez me voir aussi
vous dévoiler sans artifice mon âme
toute nue, venez me faire une visite.
Nous causerons en amis, franchement.
Je vous prouverai que je suis la femme
sincère, capable de vous offrir l'affection
la plus profonde comme la plus étroite
en amitié, en un mot la meilleure preuve
dont vous puissiez rêver, puisque votre
âme est libre. Pensez que la solitude où j'ha-
bite est bien longue, bien dure et souvent
difficile. Ainsi en y songeant j'ai l'âme
grosse. Accourrez donc vite et venez me la
faire oublier par l'amour où je veux me
mettre.
To do that, you can use the following guide:
- Save this text in a file
- Use the command
nl to number the lines of the file
-
As we are only interested in the parity of the line, we only care of the last digit of each number ; remove the useless column with the function
colrm
- Use the command
grep to keep only the odd lines.
- Remove the useless columns to print the result
-
Do the same process by using only a one line command (you'll need to use the pipe | character)
- The answer from Musset was an acrostic you get the meaning by reading the first word of each line:
Quand je mets à vos pieds un éternel hommage
Voulez - vous qu'un instant je change de visage ?
Vous avez capturé les sentiments d'un cour
Que pour vous adorer forma le Créateur.
Je vous chéris, amour, et ma plume en délire
Couche sur le papier ce que je n'ose dire.
Avec soin, de mes vers lisez les premiers mots
Vous saurez quel remède apporter à mes maux.
Bien à vous, Eric Jarrigeon
The answer of Sand was on the same kind:
Cette insigne faveur que votre cœur réclame
Nuit à ma renommée et répugne à mon âme.
After saving these two texts in separated files, first create files containing only the first column of each answer ; this can be done with the cut function (for instance, one can use " " as a delimiter, and take only the first field). Then, create a single file containing these two columns written in two separated line. Finally, make the whole operations in a single line (the function echo, with the correct option, may help).
Around the find function
A
phylogenetic tree is a "tree chowing the inferred evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical and/or genetic characteristics". A part of the tree that you can find on the wikipedia page has been made in a file
phylogenetic-tree.tar.bz2. But other data, that have nothing to do with biology, have also been introduced. The point of the exercise will be to find them and follow the instructions hidden in the tree.
-
Use the command
wget to get the file arbre_phylogenetique.tar.bz2 that you can find at the urlhttp://www.lifl.fr/~poteaux/fichiers/phylogenetic-tree.tar.bz2 in the repertory of your choice.
- What is the size of this file ?
- Uncompress the file by using the command
bunzip2.
- What is the size of the file you get ?
- This is an archive file. You need to extract it by using the
tar command.
-
Find the file
charlie_01.txt in the tree by using the command find, read it, and follow the instructions inside.
-
Now that you know a little more what is inside the original tree, do you see a way to find all the necessary files in only one search ?
A few scripts
- Without using
$@, $* or $#, write a script that print all its parameters line by line
- Write a script that print the files inside a given directory line by line, with the type of the file written before (on a separated line). The script must test if there is a single parameter and that it is indeed a directory ; if not, this must print how to use the function
- Write a script that copy all the files with the extension
.txt in a repertory given as an input, and that replace the extension by .old
Creation of a trash
We want to write a command
can manage a trash for files. This trash will be contained in the directory
$HOME/.trash. If this directory does not exist yet, the comand will have to create it.
-
Create the command
can, that will accept 3 options:
-
can -l list the trash;
-
can -r purge the trash;
-
can filenames send filenames to the trash;
- Add the option
can -x filenames that remove directly the files filenames, without going through the trash (the command shift may help)
Getting informations on the memory
The external command
ps list the process currently running on the computer. With the options -ux, one get informations on the process of the current user: for each process, its PID, the percentage of CPU and memory and the size in kilobytes used
VSZ, and a few more informations.
- Write a script
memtot that compute the sum in kilobytes of the memory used by the user's process
- Write a script
topuser that print the user taking the most CPU ; we will consider the round part of each percentage only
Generating files
-
Write a script that, given a file
cmd.ext (ext can be any extension name) containing shell commands outputs a text file out.ext that contains the commands we would get. For instance, if one runs this script on a file cmd.txt that contains:
A=4
echo A
echo $A
B=5
expr A+B
one should create a file out.txt like that:
$ A=4
$ echo A
A
$ echo $A
4
$ B=5
$ expr A+B
A+B
-
Try your script for these commands:
ls thisfiledoesnotexist
echo $?
Run also them on a shell. Do you get the same result ? Why ? How to correct this ?
C
- Some lecture notes (work in progress, that will be updated quite often)
First programs
- Write a program that read two integers and print their gcd.
- The hamming weight of a number is the number of non zero bytes its binary form contains.
- Write a program that computes the Hamming weight of an integer (type
int) in a number of operations proportional to the size of the input.
- How to get a number of operations proportional to the Hamming weight of the input ?
- What does the following code ?
main(){int x,p=0;scanf("%lu",&x);for(;x>0;p++) x&=x-1;printf("%d\n",p);}
-
Write a program that, given a value x with type
double, computes the evaluation of the polynomial P(X)=an Xn + ... + a1 X + a0 at X=x. The degree n, the coefficients ai and x will be inputed by the keyboard. To compute the evaluation of the polynomial, we will use the Horner scheme (we first compute an x + an-1, then an x2+an-1x + an-2 ...).
Plotting time computations
In this exercise, we will compute an approximation of the exponential function by using its truncated Taylor expansion. We recall that we have
exp(X)=1+X+X2/2+X3/6+X4/4!+....
The GDB debugger
We consider the following code:
# define len 10
int main ()
{
int tab[len];
unsigned int i;
for (i=len-1; i>=0; i--)
tab[i]=i;
return 0;
}
-
By reading it, what do you think this code does ? Compile it and run the executable. What's happening ?
-
To understang what's going on, we will use the debugger
gdb. To use it, we need to add the option -g of gcc when compiling. Then one runs $ gdb executable_name. One can also use the command M-x gdb of Emacs. This leads to the gdb prompt:
(gdb)
Here are some gdb commands we can use at this point:
-
(gdb) run (shortcut r) runs the executable,
-
(gdb) print i (shortcut p) prints the variable i,
-
(gdb) display i prints the value of i at each step used,
-
(gdb) whatis M gives the type of the variable M,
-
(gdb) call function(arguments) prints the result of the function call for given values,
-
(gdb) break 9 (shortcut b) makes a break when reaching the line 9 of the program. One can add conditions to the break command (for instance, (gdb) break 9 if i=0 || j != 3),
-
(gdb) b file.c:9 makes a break when reaching the line 9 of file.c,
-
(gdb) continue (shortcut c) run the program up to the next break,
-
(gdb) next (shortcut n) enables to run a program step by step after reaching a break,
-
(gdb) quit (shortcut q) quits gdb.
Run gdb and follow the variable i. When i reachs 0, what happens next ? Why ?
- How to solve the problem ?
Sorting a table
Write an algorithm that sort elements of a table, which will have a constant number of value (therefore definied by somthing like
int tab[N] where we define N in the preprocessive directives). The value of the table will be generated randomly by the function
rand. It is initialized by the
srand function from the number of seconds since January 1st, 1970 as:
srand((unsigned int) time(NULL));
To use not too big values, one should take the random numbers modulo an integer of our choice.
The algorithm should print something like:
Initial table
23 51 26 34 15
Sorted table
15 23 26 34 51
Correction:
There are a lot of sorting algorithms. Here I present two of them. At the i-th step of the algorithm, the first i elements of the table are sorted. The quicksort algorithm takes a pivot at each step and reorder the table so that all the elements greater than the pivot are after it; then, we apply the same algorithm to the two sublists. The second one is really faster in practice. You can see that by running value for high values of N (but then one should comment the tables printings)
File main.h:
# ifndef MAIN_H
# define MAIN_H
# include
# include
# include
# include "sort.h"
# define N 40 /* size of the table */
# define M 100 /* random integers will be taken in [0,M-1] */
clock_t initial_time; /* Initial time in micro-seconds */
clock_t final_time; /* Final time in micro-seconds */
float cpu_time; /* Total time in seconds */
extern void mytabprint(int[], int);
# endif /* MAIN_H */
File main.c:
# include "main.h"
void mytabprint(int tab[], int size)
{
int i;
for (i=0; i
File sort.h:
# ifndef SORT_H
# define SORT_H
extern void swap (int[], int, int);
extern void insertion_sort(int[], int);
extern void quicksort(int[], int, int);
# endif /* SORT_H */
File sort.c:
# include "sort.h"
void swap(int t[], int i, int j)
{
int tmp=t[i];
t[i]=t[j];
t[j]=tmp;
}
void insertion_sort(int tab[], int size)
{
int i,j;
for (i=0; ii; j--)
if (tab[j]
Numerical computations
Create a table tab of 20000 elements containing random numbers with the following rule:
tab[4*i] contains a float between 0 and 1112465223
tab[4*i+1] contains a float between 0 and 5
tab[4*i+2] contains a float between 0 and 5000
tab[4*i+3] contains a float between 0 and 1000000
-
Add the elements of the table as a float and print the result.
-
Sort the table and make the sum again. Do you get the same result ? Why ?
-
Make the two same computation by using long double and interpret the result.
Correction:
Here we are using the quicksort function of the last exercise, but with the following structure:
extern void swap (float[], int, int);
extern void quicksort(float[], int, int);
File main.h:
# ifndef MAIN_H
# define MAIN_H
# include
# include
# define N 20000
# define A 5001
# define S 6
# define B 1000001
# define VB 1112465224
# include "sort.h"
# endif /* MAIN_H */
File main.c:
# include "main.h"
int main()
{
int i;
float res=0;
float tab[N];
srand((unsigned int) time(NULL));
for (i=0; i
By running it, we see that we do not get the same result before or after sorting. The second result is greater. This is due to the representation of float numbers, and more precisely here how the addition of two floats numbers wit different exponent parts is dealt.
Some macros
-
Define with macros two "constants"
TRUE and FALSE
-
Write a macro
DIGORCAP(c) that test if a character given as parameter is a digit or a capital letter.
-
What does the following algorithm:
# include
# define ISDIG(c) (((c)>='0')&&((c)<='9'))
int main()
{
unsigned int=res;
int c;
while (c=getchar(), ISDIG(c))
{
res*=10;
res+=c-'0';
}
printf("%d",res);
return 0;
}
What is the difference with this one (you can compile it with the -E option to see more):
# include
# define ISDIG(c) (((c)>='0')&&((c)<='9'))
int main()
{
unsigned int=res;
int c;
while (ISDIG(c=getchar())
{
res*=10;
res+=c-'0';
}
printf("%d",res);
return 0;
}
Which is the "good" one ?
Structure manipulation
Dealing with big positive integers
-
Create a new type that represent a big positive integer (
GPI) as a vector of GPI_SIZE digits which are unsigned long
-
Write a function that add two
GPI (one have to take into account the carry for the digits).
Playing cards
Pointers
Managing a library
A book will be represented by its title, the name of its author, and a recording number. We will use the following structure:
struct book_s{
char* title;
char* author;
int num;
};
We will call library a set of books. As this exercice will be quite long, it is recommended to use a menu to try the differents functions all along the coding. Also, remember to use a makefile !
-
You can find here a file containing a set of books (one by line ; these are just random date). Create a function that can read
n books from the file and print them to the screen.
-
For the moment, we will code a library as a table of pointers on struct. Therefore, we will define a table of size TMAX, where TMAX is a constant big enough; at the beggining, each element at the table should contain the NULL value. Together with the table, one will store an integer giving the number of books stored in the library. Adding a book to the library will be made by storing it in the first free case of the table. The supression will be done by disallocating the case containing the book, and putting NULL instead.
-
Change the function of question 1 to construct one that enables to read n data in the file and store them in the library.
-
The books are numbered from 0 to 10000 in the file. May this be helpful according to the structure we're using ?
-
Create functions to
- find a book by its number,
- find a book by its title,
- find all the books from the same author,
- insert a new book,
- suppress a book,
- know if there is a book stored twice.
-
We want to compare the necessary time to manage the three functions of insertion, searching a book, and looking for a double occurence. Create a test function that uses the algorithms of the previous question and store the cpu time it takes. For the searching, we will use a book that is not in the list (so that we are in the worst case situation).
-
We want to determine the time of the research function according to the size of the library. To that purpose, we will adapt the previous functions so that one can run the lecture function by reading partially the first n lines of the file, n from 1000 to 10000 (this may take a long time !)
-
One want to visualize the curves we got for the 3 functions. We will use the
gnuplot software to that purpose.
-
Are these graph corresponding to the worst case complexity of these algorithm ?
-
Do the same process (questions a to g) by representing the library as a linked list.
-
Compare the two structures.