Plot error with plotting string values

The code runs correctly up to the second subplot, where PLOT throws an error. I would like to show the original and the mutated strings on that plot.
% Simulate DNA sequence evolution with mutations
sequenceLength = 50;
mutationRate = 0.02;
% Generate random DNA sequence
originalSequence = randsample(['A', 'T', 'C', 'G'], sequenceLength, true);
% Introduce mutations
mutatedSequence = originalSequence;
mutationPositions = rand(1, sequenceLength) < mutationRate;
mutatedSequence(mutationPositions) = randsample(setdiff(['A', 'T', 'C', 'G'], mutatedSequence(mutationPositions)), sum(mutationPositions), true);
% Display original and mutated sequences
disp('Original Sequence:');
Original Sequence:
disp(originalSequence);
TGGGTAGCCCGATTTATTGCCAATGATTTGACCCGCCAGCCAGGGTGCGA
disp('Mutated Sequence:');
Mutated Sequence:
disp(mutatedSequence);
TGGGTAGCCCGATTTATTGCCAATGATTTGACCCGCCAGCCAGTGTGCGA
% Bayesian Inference to detect mutations
priorProbabilityMutation = 0.01; % Prior probability of mutation
likelihoodMutation = mutationRate; % Likelihood of observing a mutation
% Bayes' Theorem for each position
posteriorProbabilityMutation = zeros(1, sequenceLength);
for i = 1:sequenceLength
evidence = strcmp(originalSequence(i), mutatedSequence(i));
posteriorProbabilityMutation(i) = (likelihoodMutation * priorProbabilityMutation) / ((likelihoodMutation * priorProbabilityMutation) + (~evidence * (1 - priorProbabilityMutation)));
end
% Set a threshold for detecting mutations
threshold = 0.5;
detectedMutations = posteriorProbabilityMutation > threshold;
% Display results
disp('Posterior Probability of Mutation:');
Posterior Probability of Mutation:
disp(posteriorProbabilityMutation);
Columns 1 through 19
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Columns 20 through 38
1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
Columns 39 through 50
1.0000 1.0000 1.0000 1.0000 1.0000 0.0002 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
disp('Detected Mutations:');
Detected Mutations:
disp(detectedMutations);
Columns 1 through 49 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 Column 50 1
% Visualization
figure;
subplot(3, 1, 1);
bar(posteriorProbabilityMutation, 'b');
xlabel('Position');
ylabel('Posterior Probability of Mutation');
title('Bayesian Inference');
grid on;
subplot(3, 1, 2);
plot(originalSequence, 'bo', 'LineWidth', 2);
Error using plot
Invalid first data argument.
hold on;
plot(mutatedSequence, 'rx', 'LineWidth', 2);
scatter(find(detectedMutations), ones(1, sum(detectedMutations)), 'g', 'filled');
legend('Original Sequence', 'Mutated Sequence', 'Detected Mutations');
xlabel('Position');
ylabel('Nucleotide');
title('DNA Sequence and Mutations');
grid on;
hold off;
subplot(3, 1, 3);
stem(detectedMutations, 'r', 'LineWidth', 2);
xlabel('Position');
ylabel('Mutation Detected');
title('Mutation Detection');
grid on;

4 Comments

Note that
['A', 'T', 'C', 'G']
is just a complex and indirect way of writing
'ATCG'
What exactly is your question?
Thank you for that, I will rewrite it. My question is about the second plot:
plot(originalSequence, 'bo', 'LineWidth', 2);
hold on;
plot(mutatedSequence, 'rx', 'LineWidth', 2);
I would like to show the 'ATCG' characters on the plot.
Or if you want that text as tick marks along the axes, then you could use:
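The snippet for this suggestion is not shown above. As a minimal sketch of the idea (not necessarily the commenter's original code), the nucleotides can be used as tick labels with `xticks` and `xticklabels`. The short sequences and the variable `isChanged` are assumptions for illustration, taken from the tail of the sequences in the question.

```matlab
% Short example sequences (the last 9 characters of the question's output)
originalSequence = 'AGGGTGCGA';
mutatedSequence  = 'AGTGTGCGA';
n = numel(originalSequence);

% 1 where the two sequences differ, 0 elsewhere (hypothetical helper variable)
isChanged = double(originalSequence ~= mutatedSequence);

figure;
stem(1:n, isChanged, 'r', 'LineWidth', 2);
xticks(1:n);                              % one tick per sequence position
xticklabels(num2cell(originalSequence));  % label each tick with its nucleotide
xlabel('Original nucleotide at each position');
ylabel('Changed');
grid on;
```

With the full 50-character sequence the labels get crowded, so a wider figure or a smaller font may be needed.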

Answers (1)

Hi Oganes,
I understand that you want to display both the original and the mutated DNA sequences on a plot.
The error occurs because character arrays are being passed to 'plot', which does not accept character data. To draw the DNA sequences with 'plot', map each nucleotide to a distinct number and plot those numbers instead.
The following snippet shows one way to do this:
% Convert sequences to numerical arrays for plotting
originalSequenceNum = nucleotideToNumber('AGGGTGCGA');
mutatedSequenceNum = nucleotideToNumber('AGTGTGCGA');
plot(originalSequenceNum);
hold on;
plot(mutatedSequenceNum);
legend('Original Sequence', 'Mutated Sequence');
xlabel('Position');
ylabel('Nucleotide');
title('DNA Sequence and Mutations');
set(gca, 'ytick', 1:4, 'yticklabel', {'A', 'T', 'C', 'G'});
% A mapping function for the nucleotides to numerical values:
function numArray = nucleotideToNumber(s)
map = containers.Map({'A', 'T', 'C', 'G'}, 1:4);
numArray = zeros(size(s));
for i = 1:length(s)
numArray(i) = map(s(i));
end
end
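As a possible alternative to the loop and `containers.Map` above, the same mapping can be sketched in one call with `ismember`, whose second output is the index of each character in the lookup string. The variable names here are illustrative, and any character not found in `nucleotides` would map to 0.

```matlab
% Lookup order defines the numeric code: A=1, T=2, C=3, G=4
nucleotides = 'ATCG';
originalSequence = 'AGGGTGCGA';
mutatedSequence  = 'AGTGTGCGA';

% Second output of ismember is the position of each character in nucleotides
[~, originalNum] = ismember(originalSequence, nucleotides);
[~, mutatedNum]  = ismember(mutatedSequence, nucleotides);

figure;
plot(originalNum, 'bo', 'LineWidth', 2);
hold on;
plot(mutatedNum, 'rx', 'LineWidth', 2);
hold off;
% Show letters instead of numbers on the y-axis
set(gca, 'ytick', 1:4, 'yticklabel', num2cell(nucleotides));
legend('Original Sequence', 'Mutated Sequence');
xlabel('Position');
ylabel('Nucleotide');
```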
For more details on the 'plot' function, please refer to the MATLAB documentation.
I hope it helps.

1 Comment

A simpler approach is to use CATEGORICAL arrays, which PLOT also accepts. It even labels the tick marks correctly using the categories:
S = ["A","T","C","G"];
originalSequenceNum = categorical(num2cell('AGGGTGCGA'),S)
originalSequenceNum = 1×9 categorical array
A G G G T G C G A
mutatedSequenceNum = categorical(num2cell('AGTGTGCGA'),S)
mutatedSequenceNum = 1×9 categorical array
A G T G T G C G A
plot(originalSequenceNum);
hold on;
plot(mutatedSequenceNum);
legend('Original Sequence', 'Mutated Sequence');
xlabel('Position');
ylabel('Nucleotide');
title('DNA Sequence and Mutations');
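To get closer to the second subplot in the question, the categorical approach can be combined with the marker styles used there. This is a sketch under assumptions: the names `origCat`, `mutCat` and `changed` are illustrative, and the changed positions are found by direct comparison of the two strings rather than by the posterior threshold in the question.

```matlab
nucleotides = ["A","T","C","G"];
originalSequence = 'AGGGTGCGA';
mutatedSequence  = 'AGTGTGCGA';

% One categorical element per nucleotide, with a fixed category order
origCat = categorical(num2cell(originalSequence), nucleotides);
mutCat  = categorical(num2cell(mutatedSequence), nucleotides);

% Positions where the two sequences differ
changed = find(originalSequence ~= mutatedSequence);

figure;
plot(1:numel(origCat), origCat, 'bo', 'LineWidth', 2);
hold on;
plot(1:numel(mutCat), mutCat, 'rx', 'LineWidth', 2);
% Highlight the changed positions with a larger green square
plot(changed, mutCat(changed), 'gs', 'MarkerSize', 12, 'LineWidth', 2);
hold off;
legend('Original Sequence', 'Mutated Sequence', 'Changed Positions');
xlabel('Position');
ylabel('Nucleotide');
title('DNA Sequence and Mutations');
grid on;
```

In the question's script, the same calls should work with the full `originalSequence` and `mutatedSequence` variables in place of the short literals.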

Asked: on 16 Dec 2023

Commented: on 27 Dec 2023
