DNA to Code: Mastering the Longest Common Subsequence Algorithm.
Aug 23, 2024
131 views
Discussion (0)
Written by Array Alchemist
LeetCode wizard with 75 conquered challenges under my belt.
Turning algorithm nightmares into "Aha!" moments, one explanation at a time.
Join me to level up your coding skills – from interviews to real-world application!
Before we solve this problem let's understand why this algorithm i.e longest common subsequence is important?
To address Longest Common Subsequence i will use LCS interchangeably.
Let's take an example of DNA:
We have two DNA sequences,
The LCS function calculates the length of the longest common subsequence between these two DNA sequences.
const humanDNA = "ATGCGATCGTAGCTAGCTAGCTGATCG";
const chimpanzeeDNA = "ATGCGATCGTAGCTAGCTAGCTGATCGATCG"
Running LCS algorithm on DNAs data give us following result:
In practice, this algorithm could be used in other areas such as:
This is an example of why learning LCS is important.
Now that we understand why we need to understand this algorithm, let's see how we can solve this question?
Question:
Given two strings text1 and text2, return the length of their longest common subsequence. If there is no common subsequence, return 0.
A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.
For example, "ace" is a subsequence of "abcde".
A common subsequence of two strings is a subsequence that is common to both strings.
Example 1:
Example 2:
Example 3:
The comparison table:
another comaparison:
if we submit the code to leetcode, it might take alot of time to calculate for long strings.
As seen in the image below. For each leaf of the tree node, even though the values are same, we will recompute it for all, eventually taking more time. If we have a limit, we will get following error.
Above code is brute force way of calculating each branch of the tree, even though we have calculated before. To optimise any dynamic programming problem we use memoization technique.
LCS DP Table Filling Example:
Dp is short form of dynamic programming.
Let's use an array i.e [2 *2] dp.
Technique: We're using a 2D array to store the length of the LCS for each combination of prefixes of the two strings. We fill this table iteratively.
Step-by-step process:
The bottom-right cell (3) gives the length of the LCS. The LCS is "acd".
thus this can be written as:
Complexities:
Time Complexity: O(N * M)
Space Complexity: O(N * M)
after submitting to leetcode: