src/matt_ygraph.c File Reference

File that contains all functions used to retrieve conserved regions and to select some of them.

#include <math.h>
#include "matt_ygraph.h"
#include "sequence.h"
#include "macros.h"
#include "alignment.h"
#include "options.h"
#include "io.h"
#include "carnac.h"


Functions

void destroy_main_structure (p_matt_ygraph m)
 Function used to destroy all structures allocated.
int is_present_int_array (int num, int *array, int length)
 Function used to check if a number is present in an array of integers.
int is_present_char_array (char *string, char_array *array)
 Function used to check if a string is present in a char_array.
void copy_interval_query (p_interval_query interval_1, p_interval_query interval_2)
 Function used to copy one interval query into another.
float score_function (float score_pos_i_minus_1, int nb_al_pos_i, float mean_al, float epsilon, int lambda)
 Defines the score function for the position i.
void calculate_nb_al_function (p_matt_ygraph m)
 Function used to calculate the number of alignments by position.
void calculate_score_function (p_matt_ygraph m)
 Function used to calculate the score function.
void get_interesting_regions (p_matt_ygraph m)
 Function used to determine interesting regions on the reference.
void determine_pos_query (p_matt_ygraph m, int index_interval, int index_al)
 Function used to determine the positions on the query sequence for the conserved region on the reference.
void complete_set_sequences (p_matt_ygraph m)
 Function used to determine the sequences of the reference sequence and all query sequences of all sets.
void get_sets_of_sequences (p_matt_ygraph m)
 Function used to determine the sets of sequences of conserved regions.
void eliminate_redundancy_set (p_matt_ygraph m, int index_set)
 Function used to eliminate redundant sequences inside the current set.
void compare_all_against_ref (p_matt_ygraph m)
 Function used to compare all sequences against the reference for all sets of sequences.
int compare_sort_id (const void *a, const void *b)
 Function used to compare two p_interval_query by their ref identity.
void sort_identity_ref (p_matt_ygraph m)
 Function used to sort all queries in each set by reference identity.
void check_species_repetitions (p_matt_ygraph m)
 Function used to check species repetitions.
void search_naive_clique (p_matt_ygraph m)
 Function used to search the naive clique for all sets.
void recursive_search_maxi_clique (p_set_of_seqs set, p_clique temp_clique, int index)
 Function used to search recursively the maxi clique in the current set.
void maxi_clique_search (p_matt_ygraph m)
 Function used to search the maxi clique for all sets.

Detailed Description

File that contains all functions used to retrieve conserved regions and to select some of them.

Author:
Benjamin Grenier-Boley <benjamin.grenier-boley@inria.fr>
Version:
1.02
Date:
September 2008
This file defines all functions that retrieve conserved regions from alignments and all functions that select sequences according to user parameters

Function Documentation

void calculate_nb_al_function ( p_matt_ygraph  m  ) 

Function used to calculate the number of alignments by position.

Parameters:
[in,out] m : the main structure containing all needed information
That function will count the number of alignments by position and fill the corresponding array

References alignment_info_array::al_info, DESTROY, destroy_alignment(), eliminate_redundancy_species(), alignment_info_array::nb_al_info, NEW, and OPTS_filter_species.

Referenced by main().
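
As a rough illustration of counting alignments by position, the sketch below increments a counter for every reference position covered by each alignment. The `ref_span` type and the function name are hypothetical stand-ins; the documented function works on the alignment structures held in `m`, and it also references eliminate_redundancy_species(), a step this sketch does not model.

```c
#include <stddef.h>

/* Hypothetical stand-in for one alignment's footprint on the reference:
 * inclusive, 0-based start and end positions. */
typedef struct {
    int ref_start;
    int ref_end;
} ref_span;

/* Fill nb_al[0..ref_length-1] with the number of spans covering each
 * position. Spans are clipped to the reference; a span whose start is
 * past its end contributes nothing. */
void count_alignments_by_position(const ref_span *spans, size_t nb_spans,
                                  int *nb_al, int ref_length)
{
    for (int i = 0; i < ref_length; i++) {
        nb_al[i] = 0;
    }
    for (size_t k = 0; k < nb_spans; k++) {
        int start = spans[k].ref_start < 0 ? 0 : spans[k].ref_start;
        int end = spans[k].ref_end >= ref_length ? ref_length - 1
                                                 : spans[k].ref_end;
        for (int i = start; i <= end; i++) {
            nb_al[i]++;
        }
    }
}
```

The resulting array is the kind of per-position count that the score function takes as its nb_al_pos_i argument.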

void calculate_score_function ( p_matt_ygraph  m  ) 

Function used to calculate the score function.

Parameters:
[in,out] m : the main structure containing information
That function will calculate the score function used to retrieve conserved region. The score function is defined in the macros.h file

References OPTS_lambda, and score_function().

Referenced by main().

void check_species_repetitions ( p_matt_ygraph  m  ) 

Function used to check species repetitions.

Parameters:
[in,out] m : the main structure containing information and that will be updated
That function will check if the species repetitions threshold is respected

References char_array::chars, DESTROY, is_present_char_array(), char_array::nb_chars, NEW, OPTS_max_species_repet, OPTS_verbose, print_verbose_end_checking_repetitions_set(), print_verbose_start_checking_repetitions_set(), and RENEW.

Referenced by main().
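
One possible reading of the repetition check is sketched below: a set passes if no species name occurs more than a given number of times. The function name and the flat array of names are hypothetical, and whether OPTS_max_species_repet counts total occurrences or only the extra ones is an assumption (total occurrences here).

```c
#include <string.h>

/* Return 1 if no species name occurs more than max_repet times in
 * species[0..nb_species-1], and 0 otherwise.
 * Assumption: the threshold applies to the total number of occurrences. */
int species_repetitions_ok(const char *const *species, int nb_species,
                           int max_repet)
{
    for (int i = 0; i < nb_species; i++) {
        int count = 0;
        for (int j = 0; j < nb_species; j++) {
            if (strcmp(species[i], species[j]) == 0) {
                count++;
            }
        }
        if (count > max_repet) {
            return 0;
        }
    }
    return 1;
}
```

The quadratic scan keeps the sketch short; for the small sets involved in this kind of check it is usually adequate.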

void compare_all_against_ref ( p_matt_ygraph  m  ) 

Function used to compare all sequences against the reference for all sets of sequences.

Parameters:
[in,out] m : the main structure containing information and that will be updated
That function will compare all sequences against the reference for all sets of sequences and eliminate all queries that do not respect the identity thresholds (min and max)

References copy_interval_query(), DESTROY, eliminate_redundancy_set(), NEW, OPTS_carnac, OPTS_max_id, OPTS_min_id, OPTS_verbose, print_verbose_end_compare_ref_set(), print_verbose_start_compare_ref_set(), RENEW, and small_in_large().

Referenced by main().
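
The threshold filtering described above can be sketched as an in-place compaction. The `query_hit` type and the function name are hypothetical, and treating both bounds as inclusive is an assumption; the documented function reads its bounds from OPTS_min_id and OPTS_max_id.

```c
/* Hypothetical stand-in for a query with its identity to the reference. */
typedef struct {
    int id;
    float ref_identity;
} query_hit;

/* Keep only the hits whose identity lies in [min_id, max_id], preserving
 * their order, and return the new number of hits.
 * Assumption: both bounds are inclusive. */
int filter_by_identity(query_hit *hits, int n, float min_id, float max_id)
{
    int kept = 0;
    for (int i = 0; i < n; i++) {
        if (hits[i].ref_identity >= min_id && hits[i].ref_identity <= max_id) {
            hits[kept++] = hits[i];
        }
    }
    return kept;
}
```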

int compare_sort_id ( const void *  a,
const void *  b 
)

Function used to compare two p_interval_query by their ref identity.

Parameters:
[in] a : first query to compare
[in] b : second query to compare
Returns:
-1 if "a" has a lower ref identity than "b", 0 if they are equal, and 1 otherwise
Function used to compare two p_interval_query by their ref identity

Referenced by sort_identity_ref().
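
A comparator of this kind follows the standard `qsort` convention of returning a negative, zero, or positive value. The sketch below uses a hypothetical `query` struct with an `id_ref` field, since the layout behind p_interval_query is not shown on this page; it sorts an array of structs rather than an array of pointers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an interval query; only the field used for
 * ordering is modelled. */
typedef struct {
    const char *name;
    float id_ref;   /* identity against the reference */
} query;

/* qsort comparator: ascending order of id_ref. */
int compare_by_id_ref(const void *a, const void *b)
{
    const query *qa = (const query *)a;
    const query *qb = (const query *)b;
    if (qa->id_ref < qb->id_ref) {
        return -1;
    }
    if (qa->id_ref > qb->id_ref) {
        return 1;
    }
    return 0;
}

/* Sort an array of queries in place by ascending id_ref. */
void sort_by_id_ref(query *queries, size_t nb_queries)
{
    qsort(queries, nb_queries, sizeof *queries, compare_by_id_ref);
}
```

Comparing with `<` and `>` instead of returning a float difference cast to `int` avoids truncating small differences to zero.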

void complete_set_sequences ( p_matt_ygraph  m  ) 

Function used to determine the sequences of the reference sequence and all query sequences of all sets.

Parameters:
[in,out] m : the main structure containing all sets
That function will determine the sequences of the reference sequence and all query sequences according to positions and strand on all sets of sequences

References get_complementary_base(), and NEW.

Referenced by get_sets_of_sequences().
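
Extracting a sequence "according to positions and strand" typically means copying a slice and reverse-complementing it on the minus strand. The sketch below shows that idea with hypothetical names (`extract_region`, `complement_base`); the 0-based inclusive coordinates and the '+'/'-' strand encoding are assumptions, and the documented code relies on get_complementary_base() instead.

```c
#include <stdlib.h>
#include <string.h>

/* Complement of an uppercase DNA base; anything else maps to 'N'. */
static char complement_base(char base)
{
    switch (base) {
    case 'A': return 'T';
    case 'T': return 'A';
    case 'C': return 'G';
    case 'G': return 'C';
    default:  return 'N';
    }
}

/* Return a newly allocated copy of seq[start..end] (0-based, inclusive).
 * When strand is '-', the slice is reverse-complemented.
 * Returns NULL on invalid coordinates or allocation failure; the caller
 * frees the result. */
char *extract_region(const char *seq, int start, int end, char strand)
{
    if (seq == NULL || start < 0 || end < start ||
        (size_t)end >= strlen(seq)) {
        return NULL;
    }
    size_t len = (size_t)(end - start) + 1;
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = (strand == '-') ? complement_base(seq[(size_t)end - i])
                                 : seq[(size_t)start + i];
    }
    out[len] = '\0';
    return out;
}
```

For the sequence "ACGTTGCA" and positions 1 to 4, the plus strand gives "CGTT" and the minus strand gives "AACG".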

void copy_interval_query ( p_interval_query  interval_1,
p_interval_query  interval_2 
)

Function used to copy one interval query into another.

Parameters:
[in,out] interval_1 : the interval to copy in
[in] interval_2 : the interval to copy
That function will copy all the attributes of interval_2 into interval_1

References NEW.

Referenced by compare_all_against_ref(), and eliminate_redundancy_set().

void destroy_main_structure ( p_matt_ygraph  m  ) 

Function used to destroy all structures allocated.

Parameters:
[in,out] m : the main structure to destroy
That function will destroy all allocated memory according to mode used

References DESTROY, destroy_alignment(), destroy_sequence(), and OPTS_carnac.

Referenced by main().

void determine_pos_query ( p_matt_ygraph  m,
int  index_interval,
int  index_al 
)

Function used to determine the positions on the query sequence for the conserved region on the reference.

Parameters:
[in,out] m : the main structure containing information and that will be updated
[in] index_interval : index of the reference interval in the array of conserved region
[in] index_al : index of the alignment concerned
That function will determine the positions on the query sequence for the conserved region on the reference. The index_interval parameter selects the interval and the index_al parameter selects the alignment.

References MAX, and MIN.

Referenced by get_sets_of_sequences().

void eliminate_redundancy_set ( p_matt_ygraph  m,
int  index_set 
)

Function used to eliminate redundant sequences inside the current set.

Parameters:
[in,out] m : the main structure containing information and that will be updated
[in] index_set : the index of the set to process
That function will eliminate redundant sequences inside the set (same species and overlapped positions)

References copy_interval_query(), DESTROY, is_present_int_array(), NEW, and RENEW.

Referenced by compare_all_against_ref().
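
The redundancy criterion stated above (same species and overlapping positions) can be sketched as a pairwise test followed by an in-place compaction. The `hit` type and function names are hypothetical, and keeping the first-seen member of each redundant pair is an assumption; the documented function may choose which sequence to keep differently.

```c
#include <string.h>

/* Hypothetical stand-in for a query sequence in a set:
 * species name plus inclusive start and end positions. */
typedef struct {
    const char *species;
    int start;
    int end;
} hit;

/* Two hits are redundant if they share a species and their intervals
 * have at least one position in common. */
static int is_redundant(const hit *a, const hit *b)
{
    return strcmp(a->species, b->species) == 0 &&
           a->start <= b->end && b->start <= a->end;
}

/* Drop every hit that is redundant with an earlier kept hit.
 * Returns the new number of hits; order of kept hits is preserved. */
int remove_redundant(hit *hits, int n)
{
    int kept = 0;
    for (int i = 0; i < n; i++) {
        int redundant = 0;
        for (int k = 0; k < kept; k++) {
            if (is_redundant(&hits[k], &hits[i])) {
                redundant = 1;
                break;
            }
        }
        if (!redundant) {
            hits[kept++] = hits[i];
        }
    }
    return kept;
}
```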

void get_interesting_regions ( p_matt_ygraph  m  ) 

Function used to determine interesting regions on the reference.

Parameters:
[in,out] m : the main structure containing information and that will be updated with defined regions
That function will use the scores calculated by the score function to determine positions of conserved regions on the reference sequence

References NEW, OPTS_max_length_region, OPTS_min_length_region, and RENEW.

Referenced by main().
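
How regions are delimited from the scores is not spelled out on this page. As one plausible sketch, assume a region is a maximal run of strictly positive scores and that it is kept only if its length lies between the minimum and maximum region lengths (the roles played by OPTS_min_length_region and OPTS_max_length_region). The `region` type and the function name are hypothetical.

```c
/* Hypothetical region on the reference: inclusive, 0-based bounds. */
typedef struct {
    int start;
    int end;
} region;

/* Scan scores[0..n-1] for maximal runs of strictly positive values and
 * store those whose length is within [min_len, max_len] in out, up to
 * out_cap entries. Returns the number of regions stored.
 * Assumption: "positive run" is the region criterion. */
int find_positive_runs(const float *scores, int n, int min_len, int max_len,
                       region *out, int out_cap)
{
    int found = 0;
    int run_start = -1;   /* -1 means "not inside a run" */
    for (int i = 0; i <= n; i++) {
        /* Position n acts as a sentinel that closes a trailing run. */
        int positive = (i < n) && (scores[i] > 0.0f);
        if (positive && run_start < 0) {
            run_start = i;
        } else if (!positive && run_start >= 0) {
            int len = i - run_start;
            if (len >= min_len && len <= max_len && found < out_cap) {
                out[found].start = run_start;
                out[found].end = i - 1;
                found++;
            }
            run_start = -1;
        }
    }
    return found;
}
```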

void get_sets_of_sequences ( p_matt_ygraph  m  ) 

Function used to determine the sets of sequences of conserved regions.

Parameters:
[in,out] m : the main structure containing information and that will be updated with the sets
That function will determine the positions on query sequences according to the positions of conserved regions on the reference and the info about alignments

References complete_set_sequences(), DESTROY, determine_pos_query(), get_length_ref_intersect(), is_ref_intersect(), NEW, and RENEW.

Referenced by main().

int is_present_char_array ( char *  string,
char_array *  array 
)

Function used to check if a string is present in a char_array.

Parameters:
[in] string : the string to search for
[in] array : the char_array
Returns:
1 if the string was found, 0 otherwise
That function will check if a string is present in an array of strings (char_array) and return 1 if it was found and 0 in the other case

References char_array::chars, and char_array::nb_chars.

Referenced by check_species_repetitions().
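
A membership test of this kind is a linear scan with `strcmp`. The sketch below uses the field names `chars` and `nb_chars` that appear in the reference list above, but the struct layout and field types are assumptions, and the function name is hypothetical.

```c
#include <string.h>

/* Assumed layout of char_array: an array of C strings and its length.
 * Only the field names come from the documentation. */
typedef struct {
    char **chars;
    int nb_chars;
} char_array;

/* Return 1 if string equals one of the entries of array, 0 otherwise. */
int contains_string(const char *string, const char_array *array)
{
    for (int i = 0; i < array->nb_chars; i++) {
        if (strcmp(array->chars[i], string) == 0) {
            return 1;
        }
    }
    return 0;
}
```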

int is_present_int_array ( int  num,
int *  array,
int  length 
)

Function used to check if a number is present in an array of integers.

Parameters:
[in] num : the number to search for
[in] array : the array of integers
[in] length : length of the array
Returns:
1 if the number was found, 0 otherwise
That function will check if a number is present in an array of integers and return 1 if it was found and 0 in the other case

Referenced by eliminate_redundancy_set(), and recursive_search_maxi_clique().

void maxi_clique_search ( p_matt_ygraph  m  ) 

Function used to search the maxi clique for all sets.

Parameters:
[in,out] m : the main structure containing information and that will be updated
That function will search the maxi clique for all sets. If a number of sequences has been provided by the user the maxi clique will be at least that size

References aln(), DESTROY, NEW, OPTS_carnac, OPTS_max_id, OPTS_min_id, OPTS_nb_seqs_set, OPTS_verbose, print_verbose_end_maxi_clique_set(), print_verbose_start_maxi_clique_set(), recursive_search_maxi_clique(), and RENEW.

Referenced by main().

void recursive_search_maxi_clique ( p_set_of_seqs  set,
p_clique  temp_clique,
int  index 
)

Function used to search recursively the maxi clique in the current set.

Parameters:
[in,out] set : the current set
[in,out] temp_clique : the temporary maxi clique
[in] index : the index in the query array of the sequence to add
That function will search the maxi clique recursively by trying to add the query sequence in the temporary maxi clique

References DESTROY, is_present_int_array(), NEW, OPTS_nb_seqs_set, recursive_search_maxi_clique(), and RENEW.

Referenced by maxi_clique_search(), and recursive_search_maxi_clique().
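
The recursive search can be pictured as a standard backtracking maximum-clique search: try to add each remaining sequence to the temporary clique, recurse, then undo. The sketch below is a generic version under stated assumptions, not the documented implementation: pairwise compatibility is given as a flat `n * n` matrix, the `clique` type and function names are hypothetical, and the minimum size from OPTS_nb_seqs_set is not modelled.

```c
#define MAX_NODES 16   /* sketch-only bound; n must not exceed it */

/* Hypothetical clique: indices of its members and their count. */
typedef struct {
    int members[MAX_NODES];
    int size;
} clique;

/* 1 if node is compatible with every member of c. adj is a flat n*n
 * matrix where adj[i * n + j] != 0 means i and j are compatible. */
static int connected_to_all(const int *adj, int n, const clique *c, int node)
{
    for (int k = 0; k < c->size; k++) {
        if (!adj[c->members[k] * n + node]) {
            return 0;
        }
    }
    return 1;
}

/* Extend current with nodes index..n-1 in every compatible way and
 * record the largest clique seen in best. */
void search_max_clique(const int *adj, int n, int index,
                       clique *current, clique *best)
{
    if (current->size > best->size) {
        *best = *current;
    }
    for (int node = index; node < n; node++) {
        /* Prune: even taking all remaining nodes cannot beat best. */
        if (current->size + (n - node) <= best->size) {
            return;
        }
        if (connected_to_all(adj, n, current, node)) {
            current->members[current->size++] = node;
            search_max_clique(adj, n, node + 1, current, best);
            current->size--;   /* backtrack */
        }
    }
}
```

Maximum clique search is exponential in the worst case, which is why the pruning test matters even for moderately sized sets.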

float score_function ( float  score_pos_i_minus_1,
int  nb_al_pos_i,
float  mean_al,
float  epsilon,
int  lambda 
)

Defines the score function for the position i.

Parameters:
[in] score_pos_i_minus_1 : the score at position i - 1
[in] nb_al_pos_i : number of alignments at position i
[in] mean_al : mean number of alignments by position
[in] epsilon : float added to nb_al_pos_i to avoid log(0)
[in] lambda : parameter threshold
Returns:
The score at the position i
Note:
$S_i = \max\left(0,\; S_{i-1} + \log\frac{n_i + \epsilon}{\lambda A}\right)$
$S_i$ : score at position i
$S_{i-1}$ : score at position i-1
$n_i$ : number of alignments at position i
$\epsilon$ : float to avoid log(0) in case $n_i = 0$
$A$ : mean number of alignments
$\lambda$ : parameter threshold
This function will calculate the score by position

References MAX, and OPTS_lambda.

Referenced by calculate_score_function().
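
A short worked example may help show how the score accumulates and resets. It reads the formula in the note as $S_i = \max(0,\, S_{i-1} + \ln((n_i + \epsilon) / (\lambda A)))$; the grouping of the fraction and the use of the natural logarithm are assumptions, and the values $A = 2$, $\lambda = 1$, $\epsilon = 0.1$, $S_0 = 0$ and the counts $n_1 = 4$, $n_2 = 4$, $n_3 = 0$ are chosen purely for illustration.

```latex
\begin{align*}
S_1 &= \max\Bigl(0,\; 0 + \ln\tfrac{4 + 0.1}{1 \cdot 2}\Bigr)
     = \ln 2.05 \approx 0.718 \\
S_2 &= \max\Bigl(0,\; 0.718 + \ln\tfrac{4 + 0.1}{1 \cdot 2}\Bigr)
     \approx 1.436 \\
S_3 &= \max\Bigl(0,\; 1.436 + \ln\tfrac{0 + 0.1}{1 \cdot 2}\Bigr)
     = \max(0,\; 1.436 - 2.996) = 0
\end{align*}
```

Positions covered by more alignments than $\lambda A$ raise the score, while a position with no alignment contributes a large negative term and, in this example, resets the score to zero.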

void search_naive_clique ( p_matt_ygraph  m  ) 

Function used to search the naive clique for all sets.

Parameters:
[in,out] m : the main structure containing information and that will be updated
That function will search the naive clique for all sets. It adds sequences in reference identity order and stops as soon as one cannot be included

References aln(), DESTROY, NEW, OPTS_carnac, OPTS_max_id, OPTS_min_id, OPTS_nb_seqs_set, OPTS_verbose, print_verbose_end_naive_clique_set(), print_verbose_no_connect(), print_verbose_start_naive_clique_set(), and RENEW.

Referenced by main().
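
The naive strategy described above amounts to a greedy pass that stops at the first failure. The sketch below assumes the sequences are already sorted by reference identity and that pairwise compatibility is available as a flat `n * n` matrix; both the matrix and the function name are hypothetical, since the documented function computes identities through aln() and the OPTS_min_id and OPTS_max_id thresholds.

```c
/* Greedy clique: add nodes 0, 1, 2, ... in order and stop at the first
 * node that is not compatible with every member already chosen.
 * adj is a flat n*n matrix where adj[i * n + j] != 0 means i and j are
 * compatible. members must have room for n entries.
 * Returns the size of the clique built. */
int naive_clique(const int *adj, int n, int *members)
{
    int size = 0;
    for (int node = 0; node < n; node++) {
        for (int k = 0; k < size; k++) {
            if (!adj[members[k] * n + node]) {
                return size;   /* abort as soon as one cannot be included */
            }
        }
        members[size++] = node;
    }
    return size;
}
```

Because the pass aborts instead of skipping the failing node, it can miss larger cliques: with three nodes where only 0 and 2 are compatible, it returns a clique of size 1 even though {0, 2} exists.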

void sort_identity_ref ( p_matt_ygraph  m  ) 

Function used to sort all queries in each set by reference identity.

Parameters:
[in,out] m : the main structure containing information and that will be updated
Function used to sort all queries in each set by reference identity

References compare_sort_id().

Referenced by main().


Generated on Mon Sep 22 16:34:10 2008 for matt_ygraph by  doxygen 1.5.5