diff --git a/docs/ai/llm/llm-building.md b/docs/ai/llm/llm-building.md index fa1945482ac..af8280f81e2 100644 --- a/docs/ai/llm/llm-building.md +++ b/docs/ai/llm/llm-building.md @@ -144,7 +144,7 @@ The key distinction lies in their approaches: LangChain prioritizes customizatio No matter what AI framework you pick, I always recommend using a robust data platform like SingleStore that supports not just vector storage but also hybrid search, low latency, fast data ingestion, all data types, AI frameworks integration, and much more. -![](../../media/Pasted%20image%2020241118181518.jpg) +![image](../../media/Pasted%20image%2020241118181518.jpg) [A Beginner’s Guide to Building LLM-Powered Applications with LangChain! - DEV Community](https://dev.to/pavanbelagatti/a-beginners-guide-to-building-llm-powered-applications-with-langchain-2d6e) diff --git a/docs/computer-science/interview-question/system-design-uber-data-architecture.md b/docs/computer-science/interview-question/system-design-uber-data-architecture.md index 93eedc1d9a8..2805f9fceee 100644 --- a/docs/computer-science/interview-question/system-design-uber-data-architecture.md +++ b/docs/computer-science/interview-question/system-design-uber-data-architecture.md @@ -40,7 +40,7 @@ Uber's real-time data infrastructure is powered by a combination of advanced ope The diagram below shows the overall landscape. -![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e74a5c9-a041-4657-a3e4-39017b238e76_1600x1017.png) +![image](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e74a5c9-a041-4657-a3e4-39017b238e76_1600x1017.png) Let’s take a closer look at the key technologies Uber relies on, how they work, and the unique tweaks that make them fit Uber's requirements. 
@@ -50,7 +50,7 @@ Kafka is the backbone of Uber’s data streaming. It handles trillions of messages and petabytes of data daily, helping to transport information from user apps (like driver and rider apps) and microservices. Kafka’s key role is to move this streaming data to batch and real-time systems. -![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35bab385-a2ed-4c4f-958d-66e20e5d269b_1600x813.png) +![image](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35bab385-a2ed-4c4f-958d-66e20e5d269b_1600x813.png) At Uber, Kafka was heavily customized to meet its large-scale needs. Some of the key features are as follows: @@ -75,7 +75,7 @@ By implementing these changes, Uber has made Flink more reliable and easier to u See the diagram below that shows the Unified Flink Architecture at Uber. 
-![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e8a845-940c-468d-a19c-f39f1a8cc4b4_1600x1017.png) +![image](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9e8a845-940c-468d-a19c-f39f1a8cc4b4_1600x1017.png) ### Apache Pinot for Real-Time OLAP @@ -167,7 +167,7 @@ For example, surge pricing calculations, which depend on real-time supply and de See the diagram below: -![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f0c703-4ef5-4a6e-bc5e-82c3a6c86db6_1600x1141.png) +![image](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00f0c703-4ef5-4a6e-bc5e-82c3a6c86db6_1600x1141.png) This setup requires careful synchronization of data between regions. Uber uses uReplicator, a tool they developed to replicate Kafka messages across clusters, ensuring the system remains redundant and reliable. Even if one region goes down, the data is preserved and can be quickly restored in the backup region, minimizing disruption to the service. @@ -181,7 +181,7 @@ If the primary region fails, the system fails over to a backup (passive) region, See the diagram below that shows the Active-Passive setup. 
-![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bd81bc0-b086-4fa9-bde0-b16c1fe32634_1600x961.png) +![image](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bd81bc0-b086-4fa9-bde0-b16c1fe32634_1600x961.png) The key challenge in Active-Passive setups is offset synchronization: ensuring that the consumer in the backup region starts processing from the same point as the primary region. diff --git a/docs/courses/course-time-series-analysis/intro-time-series.md b/docs/courses/course-time-series-analysis/intro-time-series.md index b69dbe4b8f9..de869b62690 100755 --- a/docs/courses/course-time-series-analysis/intro-time-series.md +++ b/docs/courses/course-time-series-analysis/intro-time-series.md @@ -139,7 +139,7 @@ Arbitrage - Buy and sell commodities and make a safe profit, while the price adj In the most intuitive sense, stationarity means that the statistical properties of a process generating a time series do not change over time. It does not mean that the series does not change over time, just that the way it changes does not itself change over time. The algebraic equivalent is thus a linear function, perhaps, and not a constant one; the value of a linear function changes as 𝒙 grows, but the way it changes remains constant - it has a constant slope; one value that captures that rate of change. -![](../../media/Pasted%20image%2020241011132306.png) +![image](../../media/Pasted%20image%2020241011132306.png) Figure 1: Time series generated by stationary (top) and non-stationary (bottom) processes.
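The distinction above can be sketched in a few lines of pure Python (an illustration, not part of the course material): white noise as a stationary process, and a random walk (the cumulative sum of that noise) as a non-stationary one whose spread keeps growing over time.

```python
import random

random.seed(42)

# Stationary process: white noise -- the way the series changes does not drift.
noise = [random.gauss(0, 1) for _ in range(10_000)]

# Non-stationary process: a random walk (cumulative sum of the noise) --
# its spread keeps growing, so its statistical properties change over time.
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

def variance(series):
    mean = sum(series) / len(series)
    return sum((x - mean) ** 2 for x in series) / len(series)

# The noise's variance stays close to 1 in an early and a late window,
# while the walk's overall spread dwarfs it.
print(variance(noise[:1000]), variance(noise[-1000:]))
print(variance(noise), variance(walk))
```

Windowed statistics like these are the intuition behind formal stationarity checks, which the course develops later.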
diff --git a/docs/databases/data-warehouses/concepts/data-engineering.md b/docs/databases/data-warehouses/concepts/data-engineering.md index 14ae3f7cf8e..4c28931938d 100644 --- a/docs/databases/data-warehouses/concepts/data-engineering.md +++ b/docs/databases/data-warehouses/concepts/data-engineering.md @@ -106,7 +106,7 @@ ## State of Data Engineering 2024 -![](../../../media/Screenshot%202024-07-15%20at%2012.16.36%20AM.jpg) +![image](../../../media/Screenshot%202024-07-15%20at%2012.16.36%20AM.jpg) [State of Data Engineering 2024](https://8040338.fs1.hubspotusercontent-na1.net/hubfs/8040338/lakeFS%20State%20of%20Data%20Engineering%202024.pdf) diff --git a/docs/databases/nosql-databases/redis/redis-data-types.md b/docs/databases/nosql-databases/redis/redis-data-types.md index 847a27577c0..d5ff76d0b7c 100755 --- a/docs/databases/nosql-databases/redis/redis-data-types.md +++ b/docs/databases/nosql-databases/redis/redis-data-types.md @@ -14,8 +14,11 @@ Redis is not a plain key-value store, it is actually a data structures server, supp ## Redis Lists Redis lists are implemented via Linked Lists. This means that even if you have millions of elements inside a list, the operation of adding a new element at the head or at the tail of the list is performed in constant time. The speed of adding a new element with the [LPUSH](https://redis.io/commands/lpush) command to the head of a list with ten elements is the same as adding an element to the head of a list with 10 million elements. + Redis Lists are implemented with linked lists because for a database system it is crucial to be able to add elements to a very long list in a very fast way. Redis Lists can be taken at constant length in constant time. + When fast access to the middle of a large collection of elements is important, there is a different data structure that can be used, called sorted sets. Sorted sets will be covered later in this tutorial.
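The constant-time head/tail behaviour described above can be sketched with Python's `collections.deque` (itself a doubly linked structure); this illustrates the list semantics only, it is not Redis:

```python
from collections import deque

# LPUSH pushes to the head, RPUSH to the tail -- both O(1) on a linked structure.
mylist = deque()
mylist.appendleft("world")   # LPUSH mylist world
mylist.appendleft("hello")   # LPUSH mylist hello
mylist.append("!")           # RPUSH mylist !

# LRANGE mylist 0 -1 returns every element from head to tail.
print(list(mylist))          # ['hello', 'world', '!']
```

Like a Redis list, a deque pays for this with slower access to the middle of a large collection.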
+ The [LPUSH](https://redis.io/commands/lpush) command adds a new element into a list, on the left (at the head), while the [RPUSH](https://redis.io/commands/rpush) command adds a new element into a list, on the right (at the tail). Finally, the [LRANGE](https://redis.io/commands/lrange) command extracts ranges of elements from lists: ```bash @@ -25,6 +28,7 @@ The [LPUSH](https://redis.io/commands/lpush) command adds a new element into a l ``` Note that [LRANGE](https://redis.io/commands/lrange) takes two indexes, the first and the last element of the range to return. Both the indexes can be negative, telling Redis to start counting from the end: so -1 is the last element, -2 is the penultimate element of the list, and so forth. + An important operation defined on Redis lists is the ability to pop elements. Popping elements is the operation of both retrieving the element from the list, and eliminating it from the list, at the same time. You can pop elements from left and right, similarly to how you can push elements on both sides of the list: ```bash @@ -41,22 +45,27 @@ An important operation defined on Redis lists is the ability to pop elements. Pop ## Capped Lists Redis allows us to use lists as capped collections, only remembering the latest N items and discarding all the oldest items using the [LTRIM](https://redis.io/commands/ltrim) command. + The [LTRIM](https://redis.io/commands/ltrim) command is similar to [LRANGE](https://redis.io/commands/lrange), but instead of displaying the specified range of elements, it sets this range as the new list value. All the elements outside the given range are removed. + Note: while [LRANGE](https://redis.io/commands/lrange) is technically an O(N) command, accessing small ranges towards the head or the tail of the list is a constant time operation.
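The capped-list pattern (LPUSH followed by LTRIM) can be sketched in plain Python; the cap of 5 is an arbitrary illustration, and this models only the semantics, not Redis itself:

```python
from collections import deque

CAP = 5  # keep only the latest 5 items, like: LPUSH timeline x; LTRIM timeline 0 4

timeline = deque()
for item in range(10):
    timeline.appendleft(item)      # LPUSH: newest item goes to the head
    while len(timeline) > CAP:     # LTRIM 0 CAP-1: keep only the head range
        timeline.pop()             # oldest items fall off the tail

print(list(timeline))  # the 5 most recent items, newest first: [9, 8, 7, 6, 5]
```

This is the usual way to keep "latest N events" style data bounded without ever scanning the whole list.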
## Blocking operations on lists Lists have a special feature that makes them suitable to implement queues, and in general as a building block for inter-process communication systems: blocking operations. + Imagine you want to push items into a list with one process, and use a different process in order to actually do some kind of work with those items. This is the usual producer / consumer setup, and can be implemented in the following simple way: - To push items into the list, producers call [LPUSH](https://redis.io/commands/lpush). - To extract / process items from the list, consumers call [RPOP](https://redis.io/commands/rpop). + However, it is possible that sometimes the list is empty and there is nothing to process, so [RPOP](https://redis.io/commands/rpop) just returns NULL. In this case a consumer is forced to wait some time and retry with [RPOP](https://redis.io/commands/rpop). This is called polling, and is not a good idea in this context because it has several drawbacks: 1. Forces Redis and clients to process useless commands (all the requests when the list is empty will get no actual work done, they'll just return NULL). - 2. Adds a delay to the processing of items, since after a worker receives a NULL, it waits some time. To make the delay smaller, we could wait less between calls to [RPOP](https://redis.io/commands/rpop), with the effect of amplifying problem number 1, i.e. more useless calls to Redis. + So Redis implements commands called [BRPOP](https://redis.io/commands/brpop) and [BLPOP](https://redis.io/commands/blpop) which are versions of [RPOP](https://redis.io/commands/rpop) and [LPOP](https://redis.io/commands/lpop) able to block if the list is empty: they'll return to the caller only when a new element is added to the list, or when a user-specified timeout is reached.
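The polling-versus-blocking contrast can be sketched with Python's standard `queue.Queue`, whose `get(timeout=...)` blocks the way BRPOP does instead of busy-waiting; this is a model of the semantics, not a Redis client:

```python
import queue
import threading

tasks = queue.Queue()

def worker():
    # Analogous to BRPOP tasks 5: block until an item arrives or 5 seconds
    # pass, instead of polling RPOP in a loop and repeatedly getting NULL.
    try:
        item = tasks.get(timeout=5)
        print("processed:", item)
    except queue.Empty:
        print("timed out, no work")

t = threading.Thread(target=worker)
t.start()
tasks.put("do_something")   # producer side, analogous to LPUSH tasks do_something
t.join()
```

The blocked consumer wakes as soon as the producer pushes, so there are no wasted round trips and no added latency between push and processing.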
+ This is an example of a [BRPOP](https://redis.io/commands/brpop) call we could use in the worker: ```bash @@ -67,14 +76,15 @@ This is an example of a [BRPOP](https://redis.io/commands/brpop) call we could u ``` It means: "wait for elements in the list tasks, but return if after 5 seconds no element is available". + Note that you can use 0 as the timeout to wait for elements forever, and you can also specify multiple lists and not just one, in order to wait on multiple lists at the same time, and get notified when the first list receives an element. + A few things to note about [BRPOP](https://redis.io/commands/brpop): 1. Clients are served in an ordered way: the first client that blocked waiting for a list is served first when an element is pushed by some other client, and so forth. - 2. The return value is different compared to [RPOP](https://redis.io/commands/rpop): it is a two-element array since it also includes the name of the key, because [BRPOP](https://redis.io/commands/brpop) and [BLPOP](https://redis.io/commands/blpop) are able to block waiting for elements from multiple lists. - 3. If the timeout is reached, NULL is returned. + There are more things you should know about lists and blocking ops. We suggest that you read more on the following: - It is possible to build safer queues or rotating queues using [RPOPLPUSH](https://redis.io/commands/rpoplpush). @@ -83,13 +93,13 @@ There are more things you should know about lists and blocking ops. We suggest t ## Automatic creation and removal of keys So far in our examples we never had to create empty lists before pushing elements, or removing empty lists when they no longer have elements inside. It is Redis' responsibility to delete keys when lists are left empty, or to create an empty list if the key does not exist and we are trying to add elements to it, for example, with [LPUSH](https://redis.io/commands/lpush).
+ This is not specific to lists; it applies to all the Redis data types composed of multiple elements -- Streams, Sets, Sorted Sets and Hashes. + Basically we can summarize the behavior with three rules: 1. When we add an element to an aggregate data type, if the target key does not exist, an empty aggregate data type is created before adding the element. - 2. When we remove elements from an aggregate data type, if the value remains empty, the key is automatically destroyed. The Stream data type is the only exception to this rule. - 3. Calling a read-only command such as [LLEN](https://redis.io/commands/llen) (which returns the length of the list), or a write command removing elements, with an empty key, always produces the same result as if the key is holding an empty aggregate type of the type the command expects to find. ## Redis Hashes @@ -114,6 +124,7 @@ OK ``` While hashes are handy to represent objects, actually the number of fields you can put inside a hash has no practical limits (other than available memory). + The command [HMSET](https://redis.io/commands/hmset) sets multiple fields of the hash, while [HGET](https://redis.io/commands/hget) retrieves a single field. [HMGET](https://redis.io/commands/hmget) is similar to [HGET](https://redis.io/commands/hget) but returns an array of values: ```bash @@ -149,8 +160,10 @@ Redis Sets are unordered collections of strings. The [SADD](https://redis.io/com 2. 1 3. 2 -> sscan myset 0 match f* +`> sscan myset 0 match f*` + Here I've added three elements to my set and told Redis to return all the elements. As you can see they are not sorted -- Redis is free to return the elements in any order at every call, since there is no contract with the user about element ordering. + Redis has commands to test for membership. For example, checking if an element exists: ```bash @@ -161,13 +174,19 @@ Redis has commands to test for membership. For example, checking if an element e ``` "3" is a member of the set, while "30" is not.
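The membership semantics just shown (SISMEMBER, and SCARD for cardinality) map directly onto Python's built-in `set`, which is a reasonable mental model even though Redis stores sets quite differently:

```python
# SADD myset 1 2 3 -- duplicates are ignored, order is not guaranteed.
myset = set()
for item in ("1", "2", "3", "3"):
    myset.add(item)

# SISMEMBER myset 3  -> 1, SISMEMBER myset 30 -> 0
print("3" in myset)    # True
print("30" in myset)   # False
print(len(myset))      # 3 -- the set's cardinality, like SCARD
```

As in Redis, re-adding "3" changes nothing: a set only records whether an element is present.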
+ Sets are good for expressing relations between objects. For instance, we can easily use sets in order to implement tags. + There are other non-trivial operations that are still easy to implement using the right Redis commands. For instance, we may want a list of all the objects with the tags 1, 2, 10, and 27 together. We can do this using the [SINTER](https://redis.io/commands/sinter) command, which performs the intersection between different sets. + In addition to intersection you can also perform unions, differences, extract a random element, and so forth. The command to extract an element is called [SPOP](https://redis.io/commands/spop), and is handy to model certain problems. + There is also a set command that provides the number of elements inside a set. This is often called the cardinality of a set in the context of set theory, so the Redis command is called [SCARD](https://redis.io/commands/scard). + When you need to just get random elements without removing them from the set, there is the [SRANDMEMBER](https://redis.io/commands/srandmember) command suitable for the task. It also features the ability to return both repeating and non-repeating elements. + | **Command** | **Example use and description** | |---------------|---------------------------------------------------------| | SADD | SADD key-name item [item ...] --- Adds the items to the set and returns the number of items added that weren't already present | @@ -178,7 +197,8 @@ When you need to just get random elements without removing them from the set, th | SRANDMEMBER | SRANDMEMBER key-name [count] --- Returns one or more random items from the SET. When count is positive, Redis will return count distinct randomly chosen items, and when count is negative, Redis will return count randomly chosen items that may not be distinct.
| | SPOP | SPOP key-name - Removes and returns a random item from the SET | | SMOVE | SMOVE source-key dest-key item - If the item is in the source, removes the item from the source and adds it to the destination, returning if the item was moved | -Operations for combining and manipulatingSETs in Redis + +Operations for combining and manipulating SETs in Redis | **Command** | **Example use and description** | |-------------|-----------------------------------------------------------| @@ -192,11 +212,14 @@ Operations for combining and manipulatingSETs in Redis ## Redis Sorted sets (ZSET) Sorted sets are a data type which is similar to a mix between a Set and a Hash. Like sets, sorted sets are composed of unique, non-repeating string elements, so in some sense a sorted set is a set as well. + However, while elements inside sets are not ordered, every element in a sorted set is associated with a floating point value, called the score (this is why the type is also similar to a hash, since every element is mapped to a value). + Moreover, elements in a sorted set are taken in order (so they are not ordered on request, order is a peculiarity of the data structure used to represent sorted sets). They are ordered according to the following rules: - If A and B are two elements with a different score, then A > B if A.score is > B.score. - If A and B have exactly the same score, then A > B if the A string is lexicographically greater than the B string. A and B strings can't be equal since sorted sets only have unique elements. + Implementation note: Sorted sets are implemented via a dual-ported data structure containing both a skip list and a hash table, so every time we add an element Redis performs an O(log(N)) operation.
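The two ordering rules above amount to sorting members by the pair (score, member). A minimal Python sketch of the comparison (Redis itself keeps the order incrementally in a skip list; it does not re-sort on read):

```python
# Each member carries a floating-point score; ties break lexicographically.
zset = {"banana": 2.0, "apple": 2.0, "cherry": 1.0}

# Rule 1: lower score first. Rule 2: equal scores -> lexicographic order.
ranked = sorted(zset.items(), key=lambda kv: (kv[1], kv[0]))
print([member for member, score in ranked])  # ['cherry', 'apple', 'banana']
```

"cherry" comes first on score alone, while "apple" beats "banana" only because their scores tie and the strings are compared.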
That's good, but when we ask for sorted elements Redis does not have to do any work at all, it's already all sorted. ```bash @@ -222,8 +245,9 @@ Implementation note: Sorted sets are implemented via a dual-ported data structur | ZADD | Adds member with the given score to the ZSET | | ZRANGE | Fetches the items in the ZSET from their positions in sorted order | | ZRANGEBYSCORE | Fetches items in the ZSET based on a range of scores | -| ZREM | Removes the item from the ZSET, if it exists |- Expire an item in list after 2 mins +| ZREM | Removes the item from the ZSET, if it exists | +- Expire an item in list after 2 mins - Set expire on item if no new data comes then expire the whole key - When inserting new data remove the data before 2 mins if available @@ -242,11 +266,13 @@ zremrangebyscore test_key 1588056950 1588056957 (to remove item) ## Lexicographical scores With recent versions of Redis 2.8, a new feature was introduced that allows getting ranges lexicographically, assuming elements in a sorted set are all inserted with the same identical score (elements are compared with the C memcmp function, so it is guaranteed that there is no collation, and every Redis instance will reply with the same output). + The main commands to operate with lexicographical ranges are [ZRANGEBYLEX](https://redis.io/commands/zrangebylex), [ZREVRANGEBYLEX](https://redis.io/commands/zrevrangebylex), [ZREMRANGEBYLEX](https://redis.io/commands/zremrangebylex) and [ZLEXCOUNT](https://redis.io/commands/zlexcount). ## Updating the score: leader boards Just a final note about sorted sets before switching to the next topic. Sorted sets' scores can be updated at any time. Just calling [ZADD](https://redis.io/commands/zadd) against an element already included in the sorted set will update its score (and position) with O(log(N)) time complexity. As such, sorted sets are suitable when there are tons of updates. + Because of this characteristic, a common use case is leader boards.
The typical application is a Facebook game where you combine the ability to take users sorted by their high score, plus the get-rank operation, in order to show the top-N users, and the user rank in the leader board (e.g., "you are the #4932 best score here"). ## Bitmaps @@ -254,8 +280,11 @@ Because of this characteristic a common use case is leader boards. The typical a Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 2^32 different bits. Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting). + Since bitmap operations don't have a data structure of their own, there isn't a special data structure to describe. The Redis strings themselves are implemented as a binary safe string. The Redis string data structure is internally called Simple Dynamic String (SDS). It is essentially a native char[] with some additional bookkeeping information. + One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example, in a system where different users are represented by incremental user IDs, it is possible to remember a single bit of information (for example, knowing whether a user wants to receive a newsletter) for 4 billion users using just 512 MB of memory.
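The space claim above is simple arithmetic worth checking: one bit per user ID, for up to 2^32 users.

```python
# One bit of information per user ID, for up to 2**32 users (~4.29 billion).
users = 2 ** 32
bytes_needed = users // 8           # 8 bits per byte

print(bytes_needed)                 # 536870912 bytes
print(bytes_needed // (1024 ** 2))  # 512 MB -- Redis's maximum string length
```

That is exactly the 512 MB maximum string length, which is why a single Redis string can address 2^32 bits.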
+ Bits are set and retrieved using the [SETBIT](https://redis.io/commands/setbit) and [GETBIT](https://redis.io/commands/getbit) commands: ```bash @@ -268,14 +297,15 @@ Bits are set and retrieved using the [SETBIT](https://redis.io/commands/setbit) ``` The [SETBIT](https://redis.io/commands/setbit) command takes as its first argument the bit number, and as its second argument the value to set the bit to, which is 1 or 0. The command automatically enlarges the string if the addressed bit is outside the current string length. + [GETBIT](https://redis.io/commands/getbit) just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero. + There are three commands operating on groups of bits: 1. [BITOP](https://redis.io/commands/bitop) performs bit-wise operations between different strings. The provided operations are AND, OR, XOR and NOT. - 2. [BITCOUNT](https://redis.io/commands/bitcount) performs population counting, reporting the number of bits set to 1. - 3. [BITPOS](https://redis.io/commands/bitpos) finds the first bit having the specified value of 0 or 1. + Both [BITPOS](https://redis.io/commands/bitpos) and [BITCOUNT](https://redis.io/commands/bitcount) are able to operate with byte ranges of the string, instead of running for the whole length of the string. The following is a trivial example of a [BITCOUNT](https://redis.io/commands/bitcount) call: ```bash @@ -291,7 +321,9 @@ Common use cases for bitmaps are: - Real time analytics of all kinds. - Storing space efficient but high performance boolean information associated with object IDs. + For example, imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with [SETBIT](https://redis.io/commands/setbit) every time the user visits the web site.
As a bit index you simply take the current unix time, subtract the initial offset, and divide by the number of seconds in a day (normally, 3600*24). + This way for each user you have a small string containing the visit information for each day. With [BITCOUNT](https://redis.io/commands/bitcount) it is possible to easily get the number of days a given user visited the web site, while with a few [BITPOS](https://redis.io/commands/bitpos) calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak. ```bash @@ -309,8 +341,11 @@ Bitmaps are trivial to split into multiple keys, for example for the sake of sha ## HyperLogLogs A HyperLogLog is a probabilistic data structure used in order to count unique things (technically this is referred to as estimating the cardinality of a set). Usually counting unique items requires using an amount of memory proportional to the number of items you want to count, because you need to remember the elements you have already seen in the past in order to avoid counting them multiple times. However, there is a set of algorithms that trade memory for precision: you end up with an estimated measure with a standard error, which in the case of the Redis implementation is less than 1%. The magic of this algorithm is that you no longer need to use an amount of memory proportional to the number of items counted, and instead can use a constant amount of memory! 12k bytes in the worst case, or a lot less if your HyperLogLog (We'll just call them HLL from now on) has seen very few elements. + HLLs in Redis, while technically a different data structure, are encoded as a Redis string, so you can call [GET](https://redis.io/commands/get) to serialize an HLL, and [SET](https://redis.io/commands/set) to deserialize it back to the server. + Conceptually the HLL API is like using Sets to do the same task.
You would [SADD](https://redis.io/commands/sadd) every observed element into a set, and would use [SCARD](https://redis.io/commands/scard) to check the number of elements inside the set, which are unique since [SADD](https://redis.io/commands/sadd) will not re-add an existing element. + While you don't really add items into an HLL, because the data structure only contains a state that does not include actual elements, the API is the same: - Every time you see a new element, you add it to the count with [PFADD](https://redis.io/commands/pfadd). @@ -324,6 +359,7 @@ While you don't really add items into an HLL, because the data structure only cont ``` An example use case for this data structure is counting unique queries performed by users in a search form every day. + Redis is also able to perform the union of HLLs. ## Other Features diff --git a/docs/databases/sql-databases/aws-aurora/high-availability-ha-others.md b/docs/databases/sql-databases/aws-aurora/high-availability-ha-others.md index 13e635d3f27..8476f242cba 100644 --- a/docs/databases/sql-databases/aws-aurora/high-availability-ha-others.md +++ b/docs/databases/sql-databases/aws-aurora/high-availability-ha-others.md @@ -43,7 +43,7 @@ MySQL NDB Cluster also protects against the estimated 30% of downtime resulting ### Master with Active Master (Circular Replication) -![](https://severalnines.com/wp-content/uploads/2022/05/05-mysql-rep-wp.jpeg) +![image](https://severalnines.com/wp-content/uploads/2022/05/05-mysql-rep-wp.jpeg) Also known as ring topology, this setup requires two or more MySQL servers which act as master.
All masters receive writes and generate binlogs with a few caveats: diff --git a/docs/databases/sql-databases/postgres/replication.md b/docs/databases/sql-databases/postgres/replication.md index f78267956d9..6cac4e678ec 100644 --- a/docs/databases/sql-databases/postgres/replication.md +++ b/docs/databases/sql-databases/postgres/replication.md @@ -62,7 +62,7 @@ The publisher- and subscriber-based [logical replication feature](https://www.p Multi-Source Replication enables a replication slave to receive transactions from multiple sources simultaneously. Multi-source replication can be used to backup multiple servers to a single server, to merge table shards, and consolidate data from multiple servers to a single server. -![](https://severalnines.com/wp-content/uploads/2022/05/07-mysql-rep-wp.jpeg) +![image](https://severalnines.com/wp-content/uploads/2022/05/07-mysql-rep-wp.jpeg) MySQL and MariaDB have different implementations of multi-source replication, where MariaDB must have GTID with gtid-domain-id configured to distinguish the originating transactions while MySQL uses a separate replication channel for each master the slave replicates from. In MySQL, masters in a multi-source replication topology can be configured to use either global transaction identifier (GTID) based replication, or binary log position-based replication. 
diff --git a/docs/devops/others/devtron.md b/docs/devops/others/devtron.md index d6724c31d65..8f6393f9b21 100644 --- a/docs/devops/others/devtron.md +++ b/docs/devops/others/devtron.md @@ -36,7 +36,7 @@ Devtron is designed to be modular, and its functionality can be easily extended #### Architecture -![](https://github.com/devtron-labs/devtron/raw/main/assets/Architecture.jpg) +![image](https://github.com/devtron-labs/devtron/raw/main/assets/Architecture.jpg) [Devtron | A Software Platform for Kubernetes Application Management](https://devtron.ai/) diff --git a/docs/economics/mutual-funds/debt-mutual-funds.md b/docs/economics/mutual-funds/debt-mutual-funds.md index fda231a0dd1..5b1167055a0 100755 --- a/docs/economics/mutual-funds/debt-mutual-funds.md +++ b/docs/economics/mutual-funds/debt-mutual-funds.md @@ -126,6 +126,7 @@ https://freefincal.com/why-i-partially-switched-from-icici-multi-asset-fund-to-i | ----------------------------------------------------------------------------------------------- | ----------------------------------- | --------------- | | Equities & Equity related instruments | 0-100 | Very High | | Debt securities & Money Market instruments including Units of Debt oriented mutual fund schemes | 0-100 | Low to Moderate | + The fund will predominantly invest in debt instruments and endeavour to maintain equity allocation between 35% and 65% (some of it will be hedged via approved derivative instruments as permitted by SEBI from time to time) It is a credible and tax-efficient* alternative to certain fixed income instruments (like bank fixed deposits), offering the scope to earn income along with the prospect of growth in Net Asset Value (NAV) when held for a reasonably long period. 
diff --git a/docs/languages/sql/sql-views.md b/docs/languages/sql/sql-views.md index a064d3697e0..1bf6911cde6 100755 --- a/docs/languages/sql/sql-views.md +++ b/docs/languages/sql/sql-views.md @@ -18,14 +18,14 @@ The view has primarily two purposes: - **Inline View:** A view based on a subquery in FROM Clause, that subquery creates a temporary table and simplifies the complex query. - **Materialized View:** A view that stores the definition as well as data. It creates replicas of data by storing it physically. MySQL does not provide Materialized Views by itself -| **Key** | **Views** | **Materialized Views** | -| --- | --- | --- | -| **Definition** | Technically, the View of a table is a logical virtual copy of the table created by the "select query", but the result is not stored anywhere in the disk. | Whenever we need the data, we need to fire the query. So, the user always gets the updated or latest data from the original tables. | Materialized views are also the logical virtual copy of data−driven by the "select query", but the result of the query will get stored in the table or disk. | -| **Storage** | In Views the resulting tuples of the query expression is not get storing on the disk only the query expression is stored on the disk. | In case of Materialized views both query expression and resulting tuples of the query get stored on the disk. | -| **Query Execution** | The query expression is stored on the disk and not its result, so the query expression gets executed every time when the user tries to fetch the data from it so that the user will get the latest updated value every time. | The result of the query gets stored on the disk and hence the query expression does not get executed every time when user try to fetch the data so that user will not get the latest updated value if it get changed in database. | -| **Cost Effective** | As Views does not have any storage cost associated with it so they also does not have any update cost associated with it. 
| Materialized Views have a storage cost associated with it so also have update cost associated with it. | -| **Design** | Views in SQL are designed with a fixed architecture approach due to which there is an SQL standard of defining a view. | Materialized Views in SQL are designed with a generic architecture approach, so there is no SQL standard for defining it, and its functionality is provided by some databases systems as an extension. | -| **Usage** | Views are generally used when data is to be accessed infrequently and data in table get updated on frequent basis. | Materialized Views are used when data is to be accessed frequently and data in table not get updated on frequent basis. | +| **Key** | **Views** | **Materialized Views** | +| --- | --- | --- | +| **Definition** | Technically, the view of a table is a logical virtual copy of the table created by a "select query", but the result is not stored anywhere on disk; whenever we need the data, we have to fire the query, so the user always gets the updated or latest data from the original tables. | Materialized views are also a logical virtual copy of the data driven by a "select query", but the result of the query is stored in a table on disk. | +| **Storage** | In views, the resulting tuples of the query expression are not stored on disk; only the query expression itself is stored. | In materialized views, both the query expression and the resulting tuples of the query are stored on disk. | +| **Query Execution** | Since only the query expression is stored and not its result, the query is executed every time the user fetches data, so the user always gets the latest values. | Since the result of the query is stored on disk, the query is not executed on every fetch, so the user may not get the latest values if the underlying data has changed. | +| **Cost Effective** | Views have no storage cost, and therefore no update cost either. | Materialized views have a storage cost and a corresponding update cost. | +| **Design** | Views in SQL are designed with a fixed architecture approach, due to which there is an SQL standard for defining a view. | Materialized views in SQL are designed with a generic architecture approach, so there is no SQL standard for defining them, and their functionality is provided by some database systems as an extension. | +| **Usage** | Views are generally used when data is accessed infrequently while the underlying tables are updated frequently. | Materialized views are used when data is accessed frequently while the underlying tables are rarely updated.
| [Difference between Views and Materialized Views in SQL](https://www.tutorialspoint.com/difference-between-views-and-materialized-views-in-sql) diff --git a/docs/mathematics/general/discrete-mathematics.md b/docs/mathematics/general/discrete-mathematics.md index 6fc9d40488e..6341f360ce5 100755 --- a/docs/mathematics/general/discrete-mathematics.md +++ b/docs/mathematics/general/discrete-mathematics.md @@ -12,11 +12,11 @@ A recurrence relation is an equation that recursively defines a sequence where t Example − `Fibonacci series − Fn = Fn−1 + Fn−2, Tower of Hanoi − Fn = 2Fn−1 + 1` -1. **Linear Recurrence Relations** +### 1. **Linear Recurrence Relations** A linear recurrence equation of degree k or order k is a recurrence equation of the form `xn = A1 xn−1 + A2 xn−2 + A3 xn−3 + ... + Ak xn−k` (where each Ai is a constant and Ak ≠ 0) on a sequence of numbers, as a first-degree polynomial. -2. **Non-Homogeneous Recurrence Relation** +### 2. **Non-Homogeneous Recurrence Relation** A recurrence relation is called non-homogeneous if it is in the form diff --git a/docs/mathematics/statistics/estimation-statistics.md b/docs/mathematics/statistics/estimation-statistics.md index b3d6c7cd9f6..0116a3689f0 100755 --- a/docs/mathematics/statistics/estimation-statistics.md +++ b/docs/mathematics/statistics/estimation-statistics.md @@ -1,19 +1,20 @@ # Estimation Statistics -Estimation statistics may be used as an alternative to statistical hypothesis tests. Statistical hypothesis tests can be used to indicate whether the difference between two samples is due to random chance, but cannot comment on the size of the difference. A group of methods referred to as new statistics are seeing increased use instead of or in addition to p-values in order to quantify the magnitude of effects and the amount of uncertainty for estimated values. This group of statistical methods is referred to as estimation statistics. Estimation statistics is a term to describe three main classes of methods.
The three main classes of methods include: +Estimation statistics may be used as an alternative to statistical hypothesis tests. Statistical hypothesis tests can be used to indicate whether the difference between two samples is due to random chance, but cannot comment on the size of the difference. A group of methods referred to as new statistics are seeing increased use instead of or in addition to p-values in order to quantify the magnitude of effects and the amount of uncertainty for estimated values. This group of statistical methods is referred to as estimation statistics. Estimation statistics is a term to describe three main classes of methods. -1. Effect Size. Methods for quantifying the size of an effect given a treatment or intervention. +### The three main classes of methods include -2. Interval Estimation. Methods for quantifying the amount of uncertainty in a value. +1. Effect Size - Methods for quantifying the size of an effect given a treatment or intervention. +2. Interval Estimation - Methods for quantifying the amount of uncertainty in a value. +3. Meta-Analysis - Methods for quantifying the findings across multiple similar studies. +Of the three, perhaps the most useful class of methods in applied machine learning is interval estimation. -3. Meta-Analysis. Methods for quantifying the findings across multiple similar studies. -Of the three, perhaps the most useful methods in applied machine learning are interval estimation. There are three main types of intervals. They are: +### There are three main types of intervals -1. Tolerance Interval: The bounds or coverage of a proportion of a distribution with a specific level of confidence. +1. Tolerance Interval - The bounds or coverage of a proportion of a distribution with a specific level of confidence. +2. Confidence Interval - The bounds on the estimate of a population parameter. +3. Prediction Interval - The bounds on a single observation. -2.
Confidence Interval: The bounds on the estimate of a population parameter. - -3. Prediction Interval: The bounds on a single observation. A simple way to calculate a confidence interval for a classification algorithm is to calculate the binomial proportion confidence interval, which can provide an interval around a model's estimated accuracy or error. This can be implemented in Python using the proportion_confint() Statsmodels function. The function takes the count of successes (or failures), the total number of trials, and the significance level as arguments and returns the lower and upper bound of the confidence interval. The example below demonstrates this function in a hypothetical case where a model made 88 correct predictions out of a dataset with 100 instances and we are interested in the 95% confidence interval (provided to the function as a significance of 0.05). ## calculate the confidence interval diff --git a/docs/readme.md b/docs/readme.md index 6837f636084..1ef2c28666b 100755 --- a/docs/readme.md +++ b/docs/readme.md @@ -2,7 +2,7 @@ slug: / --- -# Deep Notes +# Deepak's Personal Wiki This is my personal wiki where I share everything I know about this world in form of an online wiki.
@@ -67,15 +67,15 @@ loc -------------------------------------------------------------------------------- Language Files Lines Blank Comment Code -------------------------------------------------------------------------------- - Markdown 2294 250112 78577 0 171535 - JSON 3 16782 0 0 16782 - JavaScript 3 241 26 87 128 + Markdown 2433 281547 84012 0 197535 + JSON 3 20089 0 0 20089 + JavaScript 3 247 26 87 134 + YAML 1 50 5 22 23 CSS 1 30 2 7 21 - YAML 1 49 5 23 21 Plain Text 2 3 0 0 3 Bourne Shell 1 3 0 1 2 -------------------------------------------------------------------------------- - Total 2305 267220 78610 118 188492 + Total 2444 301969 84045 117 217807 -------------------------------------------------------------------------------- ``` diff --git a/docusaurus.config.js b/docusaurus.config.js index 1b555583908..f03b4a5fdcd 100755 --- a/docusaurus.config.js +++ b/docusaurus.config.js @@ -108,10 +108,15 @@ const config = { }, items: [ { - href: 'https://github.com/deepaksood619/deepaksood619.github.io', + href: 'https://github.com/deepaksood619', label: 'GitHub', position: 'right', }, + { + href: 'https://www.linkedin.com/in/deepaksood619/', + label: 'LinkedIn', + position: 'right', + }, ], }, colorMode: { diff --git a/notes-visualized-zoom.jpg b/notes-visualized-zoom.jpg old mode 100755 new mode 100644 index 454a7455be3..5173f2ceca6 Binary files a/notes-visualized-zoom.jpg and b/notes-visualized-zoom.jpg differ diff --git a/notes-visualized.jpg b/notes-visualized.jpg old mode 100755 new mode 100644 index ba4a122d6b2..0df9ca7db09 Binary files a/notes-visualized.jpg and b/notes-visualized.jpg differ diff --git a/readme.md b/readme.md index fc012d14ba6..c8a5e06ce24 100755 --- a/readme.md +++ b/readme.md @@ -1,4 +1,4 @@ -# Deep Notes +# Deepak's Personal Wiki | Deep Notes Deployed at - [https://deepaksood619.github.io/](https://deepaksood619.github.io/) @@ -26,15 +26,15 @@ loc -------------------------------------------------------------------------------- Language 
Files Lines Blank Comment Code -------------------------------------------------------------------------------- - Markdown 2294 250112 78577 0 171535 - JSON 3 16782 0 0 16782 - JavaScript 3 241 26 87 128 + Markdown 2433 281547 84012 0 197535 + JSON 3 20089 0 0 20089 + JavaScript 3 247 26 87 134 + YAML 1 50 5 22 23 CSS 1 30 2 7 21 - YAML 1 49 5 23 21 Plain Text 2 3 0 0 3 Bourne Shell 1 3 0 1 2 -------------------------------------------------------------------------------- - Total 2305 267220 78610 118 188492 + Total 2444 301969 84045 117 217807 -------------------------------------------------------------------------------- ``` diff --git a/static/manifest.json b/static/manifest.json index ee139e493ce..e196b583d48 100755 --- a/static/manifest.json +++ b/static/manifest.json @@ -1,49 +1,57 @@ { - "short_name": "Deep Notes", - "name": "Deep Notes | Deepak's Personal Wiki", - "icons": [ - { - "src": "/img/icons-192.png", - "type": "image/png", - "sizes": "192x192" - }, - { - "src": "/img/icons-512.png", - "type": "image/png", - "sizes": "512x512" - } - ], - "id": "/?source=pwa", - "start_url": "/?source=pwa", - "background_color": "#2f8555", - "display": "standalone", - "scope": "/", - "theme_color": "#2f8555", - "description": "Deep Notes | Deepak's Personal Wiki", - "screenshots": [ - { - "src": "/img/screenshot1.jpg", - "type": "image/jpg", - "sizes": "540x720", - "form_factor": "narrow" - }, - { - "src": "/img/screenshot2.jpg", - "type": "image/jpg", - "sizes": "540x720", - "form_factor": "narrow" - }, - { - "src": "/img/screenshot3.jpg", - "type": "image/jpg", - "sizes": "720x540", - "form_factor": "wide" - }, - { - "src": "/img/screenshot4.jpg", - "type": "image/jpg", - "sizes": "540x720", - "form_factor": "narrow" - } - ] - } + "short_name": "Deep Notes", + "name": "Deep Notes | Deepak's Personal Wiki", + "icons": [ + { + "src": "/img/icons-192.png", + "type": "image/png", + "sizes": "192x192" + }, + { + "src": "/img/icons-512.png", + "type": "image/png", + 
"sizes": "512x512" + } + ], + "id": "/?source=pwa", + "start_url": "/?source=pwa", + "background_color": "#2f8555", + "display": "standalone", + "scope": "/", + "theme_color": "#2f8555", + "description": "Deep Notes | Deepak's Personal Wiki", + "screenshots": [ + { + "src": "/img/screenshot1.jpg", + "type": "image/jpg", + "sizes": "540x720", + "form_factor": "narrow" + }, + { + "src": "/img/screenshot2.jpg", + "type": "image/jpg", + "sizes": "540x720", + "form_factor": "narrow" + }, + { + "src": "/img/screenshot3.jpg", + "type": "image/jpg", + "sizes": "720x540", + "form_factor": "wide" + }, + { + "src": "/img/screenshot4.jpg", + "type": "image/jpg", + "sizes": "540x720", + "form_factor": "narrow" + } + ], + "categories": [ + "education", + "finance", + "productivity" + ], + "display_override": [ + "window-controls-overlay" + ] +}
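Editor's note on the estimation-statistics hunk above: it describes computing a binomial proportion confidence interval with Statsmodels for a model that made 88 correct predictions out of 100. A minimal sketch of that same computation using only the standard library is shown below; the function name `binomial_confint` and the hardcoded z = 1.96 (the two-sided 95% critical value) are illustrative, not from the source, and the normal-approximation method matches what Statsmodels' `proportion_confint` uses by default.

```python
from math import sqrt

def binomial_confint(count: int, nobs: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a binomial proportion."""
    # point estimate of the proportion (e.g. model accuracy)
    p = count / nobs
    # standard error of a binomial proportion
    se = sqrt(p * (1 - p) / nobs)
    # interval: p +/- z * se
    return p - z * se, p + z * se

# 88 correct predictions out of 100 instances, 95% interval
lower, upper = binomial_confint(88, 100)
print(f"lower={lower:.3f}, upper={upper:.3f}")  # roughly 0.816 to 0.944
```

The interval says the model's true accuracy plausibly lies between about 82% and 94%, which is exactly the kind of uncertainty quantification the note argues p-values alone cannot provide.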