Sharding was originally a technique used to optimize large databases. A given database would be partitioned horizontally based on a chosen criterion as a means to improve overall performance. Sharding in cryptocurrencies is similar in concept.
Due to the distributed, trustless nature of most blockchains, they are limited in the number of transactions per second they can process. Since every node on the network participates in validating blocks, there is a limit to the number of transactions the network can process in a given time frame. Not all cryptocurrency implementations are equal in this regard. Some are faster than others (e.g. Bitcoin can manage around seven transactions per second, while Litecoin can handle 56 transactions per second). There are a few ways to increase the transaction speed of a given implementation, such as increasing the size of each block processed, but these can only go so far in improving transaction speed.
Sharding can address these transaction speed limitations by dividing the blockchain into separate, smaller blockchain networks. For example, if your blockchain has a maximum of 20 transactions per second and you are at that limit, you can divide it into two smaller implementations. Assuming the load splits evenly, each one now runs at 10 transactions per second and has room to take on more. You now have two "shards" of the original. This division can be done using any number of criteria, such as geography (e.g. transactions originating in Europe, North America, Asia, etc.) or product lines (e.g. transactions involving electronics, food, automobiles, etc.).
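As a rough illustration of dividing by a chosen criterion, the following Python sketch routes transactions to shards by region. The shard names, the `region` field, and the functions `shard_for` and `route` are all hypothetical, chosen only for this example; a real sharded blockchain would assign transactions according to its own protocol rules.

```python
from collections import defaultdict

# Hypothetical shard names and routing criterion (region), for illustration only.
SHARDS = ["europe", "north_america", "asia"]


def shard_for(tx):
    """Return the shard responsible for a transaction, keyed on its region."""
    region = tx["region"]
    if region not in SHARDS:
        raise ValueError(f"no shard for region {region!r}")
    return region


def route(transactions):
    """Group transactions by the shard that should process them."""
    by_shard = defaultdict(list)
    for tx in transactions:
        by_shard[shard_for(tx)].append(tx)
    return dict(by_shard)


# Made-up sample transactions.
txs = [
    {"id": 1, "region": "europe"},
    {"id": 2, "region": "asia"},
    {"id": 3, "region": "europe"},
    {"id": 4, "region": "north_america"},
]
routed = route(txs)
```

Each shard now only has to process its own list, so the per-shard load is a fraction of the total. How evenly the load splits depends entirely on the criterion chosen.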
The drawback to sharding a given blockchain is that the smaller implementations cannot talk to each other directly. A transaction made on one shard cannot be used on a different shard. So, if you have a bitcoin on shard "A", you cannot spend that same bitcoin on shard "B". Some kind of middleman would be needed to move something from one network to another, and relying on one risks losing the trustless nature that is a crucial feature of cryptocurrencies.
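The isolation between shards can be shown with a toy model in Python. This is a minimal sketch under simplifying assumptions: the `Shard` class, its methods, and the account name are invented for illustration, and each shard is reduced to a plain balance table rather than an actual blockchain.

```python
class Shard:
    """A toy shard: an independent ledger with no knowledge of other shards."""

    def __init__(self, name):
        self.name = name
        self.balances = {}

    def deposit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount

    def spend(self, account, amount):
        # A shard can only check funds against its own ledger.
        if self.balances.get(account, 0) < amount:
            raise ValueError(f"insufficient funds for {account} on shard {self.name}")
        self.balances[account] -= amount


shard_a = Shard("A")
shard_b = Shard("B")
shard_a.deposit("alice", 1)  # the coin exists only on shard A

try:
    shard_b.spend("alice", 1)  # shard B has no record of that coin
    spent_on_b = True
except ValueError:
    spent_on_b = False
```

The spend on shard "B" fails because nothing in shard "B" can see the balance held on shard "A". Moving the coin would take some extra component that both shards accept, which is the middleman role described above.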